Teaching a computer to “think” the way the human brain does means feeding it huge amounts of real-world data so that it can learn, analyze, predict, and solve problems. But in this brave new world of artificial intelligence (AI) and machine learning, there are no ethical guidelines, no regulations, and no parameters to govern how this data is collected and used.
Artificial intelligence is a computer science branch that aims to develop computers that can learn and solve problems, much as a human brain does.
When IBM’s AI supercomputer Watson is enlisted to help doctors tailor therapy to breast cancer patients, it needs to consume high volumes of medical data before it can provide better insights into personalized treatment options and their outcomes.
Every time you pick a Netflix movie or ask your digital voice assistant to call a friend, you are dealing with AI. Your personal habits, behavior, and information are tracked and noted so that the AI system has a pattern it can use to determine the products and services that best suit your needs – or come close.
“So on the one hand, the companies are saying ‘if you give us this data, we’ll give you better services, more personalized services,’” said Ben Lorica, Chief Data Scientist for O’Reilly Media, which provides technology and business training.
Machine Learning is an aspect of artificial intelligence. Its goal is to build machines that can interpret data so that they can learn to improve their performance, provide new insights, and solve problems.
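To make the definition concrete, here is a minimal sketch of what “learning from data” means in practice: a one-feature classifier picks the decision threshold that best separates labeled examples. The data below is hypothetical, invented for illustration; real systems use far richer features and models.

```python
# Minimal sketch: a learner that improves from data.
# Given labeled examples (feature value, label), find the cutoff
# that best separates label 0 from label 1.

def fit_threshold(examples):
    """Pick the threshold that best separates labels 0 and 1."""
    best_t, best_score = 0.0, -1
    for t in sorted(x for x, _ in examples):
        # Count how many examples this cutoff classifies correctly.
        score = sum((x > t) == bool(y) for x, y in examples)
        if score > best_score:
            best_t, best_score = t, score
    return best_t

# Hypothetical data: hours watching a genre vs. "liked it" label.
data = [(0.5, 0), (1.0, 0), (2.0, 0), (3.5, 1), (4.0, 1), (5.0, 1)]
print(fit_threshold(data))  # 2.0 -- the cutoff separating the groups
```

As more labeled examples accumulate, the fitted threshold tracks the true boundary more closely, which is the “improve their performance” part of the definition in miniature.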
“On the other hand,” he added, “… it’s unclear how long they retain the data, who they share the data with, and what kind of privacy protection is placed around the data.”
Companies now using machine learning to analyze consumer and user behavior already have the data available to them, Lorica noted. “What AI is opening up to them is a different set of techniques to analyze roughly the same data.”
While AI and machine learning tools don’t make it easier to collect data, they “dramatically change how the collected data is ‘used,’” said Cornell Tech computer science professor Vitaly Shmatikov in an email.
But in the absence of any regulations or ethical guidelines, he believes it is important to minimize this type of data collection, at least until there is better understanding of how these tools are being used, how they process data, and what they are extrapolating from it.
“We don’t yet fully understand what can be learned from the data,” he added. “Sometimes the data collected for one purpose [e.g., location data] reveals a lot of sensitive information about individuals, intentionally or unintentionally.”
Powerful data analysis technologies based on machine learning tend to reveal information about people “that may not be explicit in the data but can be inferred from it – social relationships, political and religious affiliation, sexual orientation, etc.,” he said. And they “may reveal a lot of information beyond the purpose for which it was collected.”
AI algorithms are predefined instructions that set a process in motion. An algorithm can help weed out fake stories or undesirable job applicants, for example, but depending on the quality of the instructions, the process could go wildly wrong.
AI algorithms similar to those used to track fake news, for example, are a particular concern, Shmatikov said, because they may “make decisions and recommendations that do not meet our priorities and values as a society,” such as who gets hired or fired, who gets credit, or who gets a prison sentence.
“In these situations,” he added, “it is important to ensure that AI algorithms are fair, unbiased, accountable, and transparent.”
For now, technology companies are the ones leading the charge toward implementing machine learning, sometimes in partnership with other interested parties. But Lorica expects the field to open up significantly.
“You [will] see a wave of companies developing tools specifically for advertisers or e-commerce sites,” he said. “You just license the technology from these other companies. We’re also seeing … software-as-a-service, which makes it even easier for companies to use these tools. … We will see maybe the rise of a new set of companies who are selling tools that take advantage of these new techniques.”
Fortunately, there are now groups and data scientists who are aware of the ethical concerns surrounding data use in machine learning, and many companies are already discussing transparency, fairness, and ethics training for data processing and machine learning algorithms.
But Lorica believes more can be done. He suggests that companies add AI ethics to their data science curriculum as they onboard new data scientists. “Or in the case of universities, then maybe fairness and transparency in [machine learning] is an important part of the training.”