Google runs a program that monitors how often health-related search terms occur, as an indicator of illness trends. As its website notes:
“there are more flu-related searches during flu season, more allergy-related searches during allergy season, and more sunburn-related searches during the summer”
Google's self-reported estimates of flu outbreaks track those of the US Centers for Disease Control almost exactly. In other words, analyzing how often specific sets of search terms occur yields estimates nearly as accurate as those of government agencies focused on public health.
This is an example of combining big data, crowd-sourcing, and predictive analytics to provide actionable intelligence regarding a public health threat. In effect, if monitoring for flu- (or other illness-) related searches can provide early indications of the geographic migration of infectious illnesses, local public health authorities can initiate preventive measures to reduce the impacts.
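To make the idea concrete, here is a minimal sketch of the kind of counting involved: for each region, the share of searches that contain a flu-related term. The keyword list, the `flu_share_by_region` function, and the sample data are all made up for illustration; nothing here reflects how Google actually does it.

```python
from collections import Counter

# Hypothetical keyword list; the real set of search terms is not public in the text.
FLU_TERMS = {"flu", "influenza", "fever", "cough"}

def flu_share_by_region(queries):
    """Take (region, query_text) pairs and return, per region,
    the fraction of queries containing at least one flu-related term."""
    total = Counter()
    flu = Counter()
    for region, text in queries:
        total[region] += 1
        # Match on whole words only, ignoring case.
        if set(text.lower().split()) & FLU_TERMS:
            flu[region] += 1
    return {region: flu[region] / total[region] for region in total}

# Made-up sample data, for illustration only.
sample = [
    ("Ohio", "flu symptoms"),
    ("Ohio", "pizza near me"),
    ("Texas", "weather today"),
    ("Texas", "movie times"),
]
shares = flu_share_by_region(sample)
print(shares)
```

A rising share in one region, week over week, is the sort of signal a public health authority could act on. Note that the calculation needs only a region and the query text, not who typed it.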
The public safety opportunity is tremendous, and this is an interesting and useful example of big data analytics. So far, so good – except for one thing: did you realize that your search terms were being used for this type of analysis?
Actually, if you read the Google privacy page, it tells you that you can look at the dashboard to see what Google knows about you. However, this is somewhat misleading: the dashboard reflects back only what Google knows about what you did while identified. More simply, Google tracks what you do while logged into one of its sites (YouTube, Picasa, Gmail, Google Docs, and so on) and presents that activity back to you.
“We may collect device-specific information (such as your hardware model, operating system version, unique device identifiers, and mobile network information including phone number).”
That means they can note which machine you are using (the “unique device identifiers”) as well as the IP address it connects from. From your employer’s location, it may be a little difficult to resolve each search to a particular desktop, but if you are using your home computer on your home internet connection, any time you log into one of your Google accounts and they note the “device-specific information,” that information is linked to your exposed identity. That means that even after you log off YouTube, Google can still figure out that a specific someone they think they can identify is using their services (search, in particular) from the same machine.
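The linkage described above takes very little machinery. The sketch below is a toy under stated assumptions: the event format, the `link_searches` function, and the sample values are all hypothetical, and it is not a description of any real system. It simply remembers which account last logged in from a given IP address and device, then attaches that account to later searches from the same pair, logged in or not.

```python
def link_searches(events):
    """Take a time-ordered list of event dicts. Each has 'ip' and 'device',
    plus either 'account' (a login) or 'query' (a search).
    Return (query, account_or_None) pairs."""
    known = {}   # (ip, device) -> account last seen logging in from there
    linked = []
    for event in events:
        key = (event["ip"], event["device"])
        if "account" in event:
            known[key] = event["account"]
        elif "query" in event:
            # The search carries no account, but the machine may be known.
            linked.append((event["query"], known.get(key)))
    return linked

# Made-up events: a login at home, then searches after logging off.
events = [
    {"ip": "203.0.113.7", "device": "home-laptop", "account": "alice@example.com"},
    {"ip": "203.0.113.7", "device": "home-laptop", "query": "flu symptoms"},
    {"ip": "198.51.100.9", "device": "office-desktop", "query": "flu symptoms"},
]
result = link_searches(events)
print(result)
```

In this toy run the home search is tied to the earlier login, while the office search, from a machine with no login on record, stays anonymous.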
The result is that not only are your search terms being collected for crowd-sourced analysis, but those same terms are also likely to be linked to a single identity. That means they are watching what you do…
the Data Roundtable.