Social Media + Satellite Imagery = Predictive Power

By Marty Graham, Contributor

When Typhoon Haiyan closed in on the Philippines in November 2013, people were already leaving the far less sturdy neighborhoods of Tacloban for “strong buildings.” Typhoons occur almost every year in this island nation, but this storm was one of the worst, with more widespread damage than had been predicted.

With the view from space blocked by the savage storm, disaster relief groups turned to Twitter to start identifying the worst damage and where help was most needed: where people might be trapped but alive, for example, and where injured victims were gathering.

Social media already had an established presence; at the time, the Philippines ranked eighth in the world for Twitter use, and was labeled by some as “the social media capital of the world.” Consequently, rescue groups were able to gather massive amounts of information from tweets in the early hours to tailor how resources were used and, as the storm cleared, were able to confirm the damage via satellite imagery.

The information on social media aligned with the satellite images, and made the multinational rescue and disaster relief response more precise and faster than had previously been possible.

What disaster responders found has planners wondering whether the unstructured, incidental communications on social media could become part of a reliable tool. What if there’s a way to fuse social media with images from space, and use geographic mapping to make sense of vital information in real time or near real time? Could that provide reliable information to guide disaster relief, identify hidden nuclear facilities, track the spread of disease, and even watch social and political unrest turn into action?

While some of this information is already synthesized to a degree—a clutch of companies, for example, sell images to the news media after reporters learn of newsworthy problems via social media—no one has transformed the anecdotal blend of social media and imagery in a way that renders the fused sets of data statistically valid and accurate enough to generate predictive models. Researchers worldwide are working to develop a method that can be duplicated to fuse the discordant data.

This fusion has become more pertinent and more possible with the explosion of social media data: Twitter connects 330 million people; Facebook boasts about 2.4 billion users; and 2.7 billion log into its family of apps, which includes Instagram and WhatsApp. Meanwhile, the availability of satellite imagery is also expanding rapidly, with almost 5,000 satellites now circling the earth. The amount, frequency, and quality of imagery will only continue to grow and improve as more satellites are launched.

Geography, in the form of geolocation in social media, is what makes the two sets seem like they should be used together, researchers say. Machine learning and the cloud’s capacity to manage enormous amounts of data are making it possible to sort and analyze the data floods from social media and satellite imagery, and fusing them effectively is an emerging area of study.

Floods of Fresh Data

What makes fusing this data particularly compelling is how the strength of one is the weakness of the other: Social media offers real-time insight at lightning speed, but requires verification; satellite imagery, on the other hand, isn’t bursting with of-the-moment intel, but is usually reliable and accurate, and ties to what we already know from earlier images.

Hootsuite, in its 2019 annual social media report, estimates that 57 percent of people in the world, about 4.4 billion individuals, are connected to the internet, and that number continues to grow rapidly. But as users learn over and over again, not all messages are trustworthy. Ming-Hsiang Tsou, who heads the Center for Human Dynamics in the Mobile Age, estimates that about 30 percent of what’s on social media is generated by bots.

“We know social media has a lot of noise and bias,” Tsou says. “So we try to identify and get rid of noise with a machine learning algorithm.”

One of Tsou’s efforts is focused on how people behave during disasters (whether they evacuate from fire zones, for example), and he has homed in on social media that includes geographic locations.

“About 1 percent of social media is geolocated, but there’s so much of it that 1 percent is an effective sample,” he says.

Geolocated social media posts can provide enormous amounts of information that’s vital in a disaster, he continues. Pictures of floods and fires, for instance, show how fast and far they’ve spread.
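As a rough illustration of how geolocated posts might be turned into a picture of spread, the sketch below keeps only posts that carry coordinates and counts them per map grid cell. It is a minimal sketch, not the method Tsou's center uses: the sample posts, the field names (`text`, `lat`, `lon`), and the 0.05-degree cell size are all assumptions made for the example.

```python
from collections import Counter
from math import floor

# Hypothetical sample posts; field names and values are invented for illustration.
posts = [
    {"text": "Water rising on our street", "lat": 11.2443, "lon": 125.0039},
    {"text": "Stay safe everyone", "lat": None, "lon": None},
    {"text": "Bridge is flooded", "lat": 11.2467, "lon": 125.0011},
    {"text": "Power is out here", "lat": 11.3120, "lon": 124.9605},
]


def grid_cell(lat, lon, cell_deg=0.05):
    """Map a coordinate to a (row, col) grid cell that is cell_deg degrees wide."""
    return (floor(lat / cell_deg), floor(lon / cell_deg))


def count_by_cell(posts, cell_deg=0.05):
    """Drop posts without coordinates, then count the rest per grid cell."""
    geolocated = [
        p for p in posts if p["lat"] is not None and p["lon"] is not None
    ]
    return Counter(grid_cell(p["lat"], p["lon"], cell_deg) for p in geolocated)


counts = count_by_cell(posts)
for cell, n in counts.most_common():
    print(cell, n)
```

Repeating the count over successive time windows would show which cells gain posts over time, and the same grid cells could in principle be matched to satellite image tiles covering the same area.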

“How many people will actually evacuate? We saw that only 50 percent did, and that can help us plan what disaster relief is needed and where,” he says. “If you try to use census data, you realize that its data is limited to nighttime; it’s [designed to document] where people live, but people don’t stay home all day. We can use social media to know dynamic movement throughout the day.”

Narrowed down to geolocated messages, there’s still a lot of data. For example, during the August 2018 Ranch fire in Northern California, Facebook kept anonymized data from people who were under mandatory evacuation orders. Researchers expected to see predictable movement out of the evacuation zones to shelters, but instead found movement to be nonlinear and unpredictable. The evacuees’ route of return into the area once the evacuation was lifted was predictable, however.

“This research demonstrates that social network data can be a valuable tool to monitor human behaviors in response to disasters, such as wildfires in areas that have been exacerbated by urbanization,” says Son Nghiem, remote sensing expert at NASA’s Jet Propulsion Laboratory, who oversaw this research.

Han Wang, a computer science professor at Texas A&M University, is part of a team that fused the disparate data sets for disasters. They taught their algorithm to pick out posts containing key phrases like “be careful” and the heartbreaking “take care of my kids.”

“We use simple keywords to sort out the irrelevant messages,” Wang says. “Hashtags matter. You see words like fire, flooding, hurricane, shelter. From those messages, we can have real-time information about where the flood is and how fast it is spreading.”
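The keyword sorting Wang describes can be pictured with a short sketch like the one below. It is a simplified assumption about how such a filter might look, not the team's actual algorithm: the keyword set is taken from the words Wang mentions, while the function names and sample messages are invented for the example.

```python
import re

# Keywords taken from Wang's quote; a real system would likely use a longer list.
KEYWORDS = {"fire", "flooding", "hurricane", "shelter"}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words.

    Hashtags such as "#hurricane" reduce to the plain word "hurricane".
    """
    return set(re.findall(r"[a-z]+", text.lower()))


def is_relevant(text, keywords=KEYWORDS):
    """Return True if the message contains at least one disaster keyword."""
    return bool(tokenize(text) & keywords)


# Hypothetical messages, invented for illustration.
messages = [
    "Flooding on Main St, water up to the doors #hurricane",
    "Great coffee this morning",
    "Shelter at the high school is open",
]

relevant = [m for m in messages if is_relevant(m)]
for m in relevant:
    print(m)
```

Whole-word matching is a deliberate simplification here. It misses variants such as "flooded" or "fires", so a practical filter would probably add stemming or a broader vocabulary.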

Pictures Near and Far

An astronomical increase in satellite imagery is on the horizon, boosted by micro-satellites developed by companies including Draper, a not-for-profit research and development organization. Draper’s Kim Slater, systems manager and business lead for space innovations, says that thousands of new, small satellites will go into orbit in the next few years, and we already can get daily images from almost every place on earth. According to Pixalytics, observation satellites are nearly as numerous as communications satellites.

“In a year or two, we’ll get every place on earth at least every 10 minutes and in some places, every minute,” Slater says.

Draper is also learning to connect social media to satellite imagery—and, so far, can detect social unrest and the trajectory of disease spread, as well as how people are reacting to disasters.

“Twitter feeds, anything on social media that shows people’s state of panic, is highly predictive about how diseases spread, for example,” she says. “We can take Twitter feeds and satellite imagery, social norms and other data, and we can actually predict if one case of Zika is going to turn into 10 or 10,000.”

Data is power, after all, and social media holds an awful lot of information that can be used to identify and predict events, information that data scientists are just beginning to build the skills to use. But not all social media data is in the form of words. Wang likes to tap into Flickr, the photo website many use to protect images, as well as share them.

One important part of using images smartly is making sure there’s accurate historical data, Wang says. Tying current information to earlier information from both social media and satellite images can help sort out the junk; for example, a selfie from yesterday that includes a street view, paired with a new view of the same street showing upended cars and smashed buildings, confirms that a violent disaster has occurred.

“Geolocation is more important than everything else,” Wang says. “People like to hide the valuable geo info and we want it [in order] to see how widespread their [disaster] issues are.”

The value of Wang and his colleagues’ work in this emerging effort isn’t that they’ve solved problems at a specific disaster; it’s that they’ve established a framework to refine and fuse data in future disasters.

“We think of satellite images as sensors in space,” Tsou says. “Now we’re learning to think of social media as sensors on humans.”