Jake self
Class of '20

Bio
Hey there, my name is Jake. I'm a sophomore at Laguna Blanca School, and I'm happy to get the chance to expand my knowledge of the STEM fields in this two-year elective. I'm especially glad to have the opportunity to learn beyond what we are taught in Chemistry, Biology, Physics, and similar classes, and to explore different fields of applied science.
​
I am interested in Data Science, particularly how it applies to government and business, and I have researched the topic both in and out of class. I recently wrote a research paper on Data Science and its impact on the everyday internet user, aimed at teaching the everyday person the basics of Data Mining.
​
The Information Treasure Trove:
Data Mining and its Effects on the Internet User
Right now, at this very moment, billions of people are being watched by their computers. Dozens of corporations, big and small, across the planet are spending hundreds of thousands of dollars to observe and record people's Google searches, conversations, and pictures. Those who were warned at all were warned only by a few lines in an end user license agreement that they “read” in two and a half seconds. This “Data Mining” is used by businesses of all kinds to identify people who would be interested in their products and to target them individually with advertisements. This way, advertisers know exactly whom to target and can advertise more efficiently for less money. If someone were to look up the names and prices of various guns and weapons online enough times, they would begin to see gun advertisements on every website they visit. How does this happen? Google and companies like it sell the search history of their users to people who can use it to advertise.
​
The art of “Data Mining” is a very old one. Since time immemorial, vendors and businessmen have wanted to know what their customers enjoy. Vendors want to see who in the town loves their product and who doesn’t, so that they can focus directly on the people who are more likely to buy it. In the past, finding out who liked what was extremely difficult. One would have to ask around to find out who wanted what, and risk being thought of as overly intrusive or impolite. With the invention of the internet, however, Data Mining has been taken to the next level. Instead of asking people directly what they like or are interested in, companies and businesses can easily buy this information from internet service providers, who sell off the search information of their users. This heightened form of data collection and vending can be seen as both extraordinarily intrusive and extremely lucrative.
​
There are dozens of varied applications for Data Mining, ranging from the targeted advertisement described previously to the creation of predictive algorithms. The uses of a massive amount of raw data are almost limitless, and so companies will pay a high price to obtain this kind of information. The buying and selling of this data is hugely lucrative for both the buyer and the seller, and it is the source of much of the money made via the internet. This vast, unregulated information marketplace often conflicts with what many would perceive as ethical standards. The collection of people’s data happens every day. Every social media post you’ve ever made, every email you’ve ever sent, and every Google search you’ve ever run has been recorded and sold.
Google itself has a convenient, user-friendly page on which it explains the broadest categories of data it collects. It reads, “The three main kinds of data we collect: Things you do… Things you create… Things that make you ‘you.’” The page then explains that Google’s data collection is just to “help the user” and make its services “faster, smarter, and more useful.” It says that Google doesn’t sell your personal information, though it does sell the data it collects, and it describes how its business with advertisers works on the most basic level. As if all-encompassing data collection and vending on the part of Google didn’t already take and sell enough, companies like Facebook and Snapchat cover the rest of the bases. Snapchat writes in its privacy policy, “we may—with your consent—collect information from your device’s phone book… Your IP address… Pages you visited before or after you navigated to our website.” It also lets the reader know that all of this information is on the market and being sold to “third parties.” Third parties include advertisers, other data collectors, and the government (and not only the US government, I might add). If someone were to Snapchat a picture of their social security number, it would be sold; if someone were to message a friend directly about something personal, it would be sold immediately. Other Snapchat users may not know what the user discussed with their friend, but every government and business in the world sure does.
These are not the only ways your data is used by businesses and governments around the world, however. Sometimes the intent is less menacing: data mining is also used to uncover fraud, scams, and, ironically, privacy intrusion. For example, according to a Mashable article by Leah Betancourt, “Lending Club makes sure the user’s information checks out to try to protect his or her identity, according to Garcia. So they will compare application information from a credit file against information that’s publicly available. He said that if there’s a mismatch, it gives them more reason to go to more strict identification procedures.” This use of Data Mining, based not on selling the user's data but on protecting it, is rare but useful. Companies can find a pattern in their clients' behavior and compare it against their future activity to ensure that they are behaving normally. For example, if someone used a credit card to buy various household items on Amazon, shipped to a particular address, and then suddenly began to purchase firearms from some other website across the world, the credit card company would be able to shut down the credit card and contact the card's owner.
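To make the credit card example concrete, here is a minimal sketch in Python of how such a pattern check might work. Everything in it is a hypothetical illustration: the Purchase record, the is_suspicious function, and the rule itself (flag a purchase only when both its category and its shipping address are new to the card) are assumptions made for this example, not a description of how any real card issuer detects fraud.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Purchase:
    """One card transaction (hypothetical record for illustration)."""
    merchant: str
    category: str
    ship_to: str


def is_suspicious(history, new):
    """Flag a purchase whose category AND shipping address are both
    absent from the card's past purchases (assumed toy rule)."""
    known_categories = {p.category for p in history}
    known_addresses = {p.ship_to for p in history}
    return (new.category not in known_categories
            and new.ship_to not in known_addresses)


# Made-up purchase history matching the essay's example.
history = [
    Purchase("Amazon", "household", "12 Oak St"),
    Purchase("Amazon", "household", "12 Oak St"),
]

usual = Purchase("Amazon", "household", "12 Oak St")
odd = Purchase("Overseas Arms Shop", "firearms", "unknown overseas address")

print(is_suspicious(history, usual))  # False
print(is_suspicious(history, odd))    # True
```

A real system would weigh many more signals and would likely use a statistical model rather than a single rule, but the idea is the same: learn what is normal for a customer, then react when new activity falls far outside it.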
Data collection has also been used for research purposes. With the internet and this kind of intrusive Data Mining becoming common practice, more and more information has become available to the world. Data mining corporations have created massive databanks covering a wide range of statistics. From the number of people who drive individual cars, to the weather in millions of cities across the world, to the daily schedules of billions of people, every bit of raw societal data you could imagine is held by these corporations, and it is extremely useful for scientists. Predictive modeling has also become extremely popular among Data Scientists, as broad trends that have been recorded for years can often be modeled with ease. Data Mining is what enables this kind of research.
There remains the question, however, of where the acquisition of people's information will lead in the future. It has certainly proven useful in many fields of research, in identity protection, and in the development of prediction algorithms, but it has also resulted in an unsettling amount of our personal information making its way out into the world. If the current trend continues, it seems that data collection by large corporations will only grow more intrusive, with even less regulation. Net neutrality rules and broadband privacy protections have recently been rolled back by regulators and legislation. With the United States Government keeping a very lax policy towards internet regulation, companies have been able to get away with large-scale Data Mining operations with little to no government oversight. The closest the government has come to regulating internet-based companies is requiring these corporations to have a Privacy Policy, and all that involves is the company telling the user just how much of their information is sold. As data collection on the internet grows more common and more accurate, it becomes more and more attractive to companies of all kinds across the world.
Overall, it seems that the acceptability of Data Mining falls to its victims to decide. A good many people across the world openly accept, and are even thankful, that their information is taken and sold to advertisers. They enjoy the luxury of being surrounded by ads for things they like. Some people would gladly give up their phone numbers, conversations, and Google searches just so that they can get a few exciting ads and emails from corporations they are interested in. However, most want their personal information kept private. Many people do not want Google or Amazon to know what they said to a friend in an email, or what websites they have visited, or really anything that shouldn’t concern Google or Amazon. Many see this as a kind of “Big Brother” action on the part of businesses and believe that the companies have no right to people's information.
The terms of service and privacy policies of businesses have sometimes simply been ignored, however, and the personal information of clients has been taken for profit or politics. One such example came to light when an employee from Cambridge Analytica informed the press that the company had been collecting and cataloging the private information of Facebook users. This information, including liked pages, the knowledge of friend accounts, and private messages, was taken by Cambridge Analytica, which gave it to members of the Trump presidential campaign. Not only has this revelation significantly damaged people’s trust in social media corporations such as Facebook, but it has also sparked conflict among lawmakers in the United States, and the idea of nationalizing the business has been put on the table. With the lack of new legislation for the internet in the United States, companies tend to have free rein. But when a company like this begins to dramatically impact the elections and the governing of the United States, the government has to take action. With no laws established in this area at this point, however, we do not know what the United States government will do in response.
We do, however, have the ability to keep our personal information private, though it can be challenging. There are internet browsing companies that don’t sell your information. There are email companies that don’t sift through the messages you send. There is the option of not using social media. There are ad blockers, computer manufacturers, programs, and VPNs all over the internet that are created for the sole purpose of either limiting the amount of personal information taken by companies or just making the Data Mining less intrusive. These tools matter because this problem isn’t going to go away. This is a field in which the problem isn’t being solved and is only getting worse. The cost of the innovative research that Data Mining has enabled is the privacy, anonymity, and security of every internet user in the world.
Works Cited
​
“What Is Data Mining (Predictive Analytics, Big Data).” What Is Data Mining, Predictive Analytics, Big Data, www.statsoft.com/Textbook/Data-Mining-Techniques.
“14 Useful Applications of Data Mining.” Big Data Made Simple, 23 June 2015, bigdata-madesimple.com/14-useful-applications-of-data-mining/.
Betancourt, Leah. “How Companies Are Using Your Social Media Data.” Mashable, Mashable, 2 Mar. 2010, mashable.com/2010/03/02/data-mining-social-media/#RwOg6EhfIEqG.
Stein, Joel. “Data Mining: How Companies Now Know Everything About You.” Time, Time Inc., 10 Mar. 2011, content.time.com/time/magazine/article/0,9171,2058205,00.html.
“Google Privacy | Why Data Protection Matters.” Google, Google, privacy.google.com/your-data.html.
“Privacy Policy.” Privacy Center – Snap Inc., Snapchat,