Public health officials are focusing on the 30% of the eligible population who remained unvaccinated against COVID-19 as of the end of October 2021. Reaching them requires finding out where those people are and why they are unvaccinated.
People remain unvaccinated for a variety of reasons, including belief in unfounded conspiracy theories about the disease, the vaccines, or both; distrust of the medical establishment; concerns about risks and side effects; fear of needles; and difficulty in accessing vaccines. To target their messaging and outreach both geographically and by type of hesitancy, public health officials need good data to guide their efforts. Traditional survey methods are helpful but expensive.
Another approach is to assess vaccine hesitancy through the lens of social media. As an artificial intelligence researcher, I analyze social media data using machine learning. My latest research, conducted with graduate student Sarah Melotte and accepted for publication in the journal PLOS Digital Health, predicts the degree of vaccine hesitancy at the ZIP code level in US metropolitan areas by analyzing geo-located tweets.
We found that by processing geo-located Twitter data with readily available machine learning techniques, we could estimate vaccine hesitancy by ZIP code more accurately than by using ZIP code characteristics such as median home value and the number of health care and social services facilities.
Limits of surveys
Surveys, such as the Gallup COVID-19 survey launched in 2020, estimate the level of vaccine hesitancy in the general population by asking a representative sample a yes/no vaccine hesitancy question: If a Food and Drug Administration-approved vaccine to prevent coronavirus/COVID-19 was available right now at no cost, would you agree to be vaccinated? Estimated vaccine hesitancy is the percentage of individuals who respond “no”. As shown in both our research and the work of others, factors such as location, income and education level are all related to vaccine hesitancy.
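To make the arithmetic concrete, here is a minimal Python sketch of that estimate: the share of "no" answers, computed per ZIP code. The function name `hesitancy_by_zip` and the sample responses are hypothetical and invented for illustration; they are not Gallup data or code from the study.

```python
from collections import defaultdict


def hesitancy_by_zip(responses):
    """Return the percentage of 'no' answers for each ZIP code.

    `responses` is an iterable of (zip_code, answer) pairs, where
    answer is the string 'yes' or 'no'.
    """
    totals = defaultdict(int)
    noes = defaultdict(int)
    for zip_code, answer in responses:
        totals[zip_code] += 1
        if answer.strip().lower() == "no":
            noes[zip_code] += 1
    # Hesitancy = 100 * (number of "no" answers) / (number of answers)
    return {z: 100.0 * noes[z] / totals[z] for z in totals}


# Made-up responses, for illustration only.
sample = [
    ("90001", "no"), ("90001", "yes"), ("90001", "yes"), ("90001", "yes"),
    ("10001", "no"), ("10001", "yes"),
]
estimates = hesitancy_by_zip(sample)
```

With only a handful of respondents per ZIP code, an estimate like this swings widely from one sample to the next, which is one reason small survey samples are a problem at this level of geographic detail.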
A common disadvantage of such surveys is that detailed questions are expensive to administer. Sample sizes tend to be small because of cost constraints and non-response, and non-response has been made worse by recent political polarization. Computational social science methods, which use computer algorithms to analyze large amounts of data, are another option, but they can have trouble interpreting noisy social media text to gain insight.
Our work takes on the challenge of using publicly available Twitter data to accurately estimate vaccine hesitancy in a given ZIP code. We focused on ZIP codes in major metropolitan areas known for high tweeting activity, where users also enable GPS more often.
As a first step, we downloaded all tweets from a publicly available dataset called GeoCoV19, which filters tweets to be as relevant to COVID-19 as possible. Next, using a peer-reviewed methodology, we narrowed these down to GPS-enabled tweets from the top metropolitan areas. Then we randomly split the tweets into a training set and a test set. The former was used to develop the model, while the latter was used to evaluate it.
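A random split of this kind can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the study's code: the function name `split_train_test`, the 20% test fraction and the fixed seed are all hypothetical choices, since the split ratio is not given here.

```python
import random


def split_train_test(items, test_fraction=0.2, seed=0):
    """Randomly split `items` into (train, test) lists.

    test_fraction and seed are illustrative assumptions.
    """
    shuffled = list(items)                  # copy so the input is untouched
    random.Random(seed).shuffle(shuffled)   # seeded for reproducibility
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder tweets standing in for the filtered, GPS-enabled set.
tweets = [f"tweet {i}" for i in range(10)]
train, test = split_train_test(tweets)
```

Keeping the test tweets out of model development is what makes the later evaluation a fair check rather than a memory test.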
Training a model to predict a ZIP code’s vaccine hesitancy is like drawing a straight line through a set of points so that the line runs as close to the center of the points as possible. This is known as the line of best fit, and it indicates the trend in the data. The first step is to convert the raw text of the tweets into data points.
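For readers who want to see the analogy in code, the sketch below fits a line of best fit by ordinary least squares using plain Python. The helper name `fit_line` and the four data points are made up for illustration; the models in the study work on many-dimensional data points rather than a single x value.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative points that lie exactly on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(xs, ys)
```

Once the slope and intercept are known, predicting a new value is just a matter of plugging a new x into the line.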
Recently developed deep neural networks are able to automatically convert text into data points so that tweets with similar meanings end up closer together. We used such a network to convert our tweets into data points and then trained our machine learning models on those data points. We validated our model using Gallup COVID-19 survey results.
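The idea of turning text into data points can be illustrated with a much simpler stand-in than a neural network. The sketch below uses plain word counts and cosine similarity, which is an assumption made purely for illustration and is not the network used in the study; the tweets are invented. Unlike a learned model, word counts only capture shared words, not shared meaning.

```python
import math
from collections import Counter


def to_vector(text, vocabulary):
    """Turn text into a list of word counts, one entry per vocabulary word."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]


def cosine(u, v):
    """Cosine similarity: 1.0 means same direction, 0.0 means nothing shared."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Invented tweets, for illustration only.
tweets = [
    "worried about vaccine side effects",
    "vaccine side effects worry me",
    "great weather in the city today",
]
vocabulary = sorted({word for t in tweets for word in t.lower().split()})
vectors = [to_vector(t, vocabulary) for t in tweets]

sim_related = cosine(vectors[0], vectors[1])    # two tweets about side effects
sim_unrelated = cosine(vectors[0], vectors[2])  # side effects vs. weather
```

Here the two tweets about side effects score higher than the unrelated pair, which is the property a model needs from its data points. A neural network goes further by also placing "worried" and "worry" close together.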
Our method predicted levels of vaccine hesitancy more accurately than methods that used only generic features, such as median home prices within ZIP codes, rather than social media data. We also showed that our model remains effective in the presence of tweets that are not related to vaccines or COVID-19. The GeoCoV19 dataset is good, but it includes many tweets that are not specifically relevant to vaccines and a small but non-trivial fraction that are not relevant to COVID-19 at all.
Early detection and prevention
In research currently undergoing peer review, we have developed algorithms that automatically detect the potential causes and extent of vaccine hesitancy from social media. Our preliminary analysis confirms that while some causes are the result of conspiracy theories and misinformation, others stem from legitimate concerns such as potential vaccine side effects.
We expect that people with these concerns may be more receptive to vaccination if they are presented with reliable sources of information that assuage their fears. In the future, public health officials may use machine learning to quickly detect vaccine hesitancy on social media. They could then use algorithms to automatically deliver targeted information and go on the offensive against the spread of health misinformation.
Such futuristic digital public health systems could lead to healthier outcomes in both the physical and digital realms.