Data Ethics and AI

At the Data School today we had an introduction to the ethics of data. I had not given the topic much thought before, so quite a bit of what came up surprised me. There is a lot to think about when working with different datasets: where the data was sourced, how it was collected, and how we as consultants use it without adding our own biases when creating charts. Before we even get to charts, we need to make sure we are not excluding relevant data from our pipelines when building workflows. When working with client data, the aim is to do so ethically. “Be good and do good work” is the approach TIL takes. If we do find ourselves in an ethical dilemma, we can reach out to the large community at TIL to figure out how to handle the situation or how to make the client aware of it.

Towards the end of the day we were asked to find a current event that may have ethical implications and, using our newfound knowledge of data ethics, to voice our take on what is ethical, what isn’t, and what concerns us. The topic I chose to dig into was generative artificial intelligence. I found a great article on ZeroHedge by Tyler Durden titled “9 Problems With Generative AI, In One Chart” (link below). The article covers potential problems with AI that may arise as well as issues that have already come to light in these early stages. One issue I feel is important to talk about is copyright. Where does AI find the information when you ask it a question? From my understanding, these models are trained on large amounts of text, much of it taken from publicly available sources on the internet, and they use what they learned to generate a response to the prompt you give them. A report from Copyleaks stated that 60% of OpenAI’s GPT-3.5 outputs contained some form of plagiarism. This is an alarming percentage. With new versions coming out, I would hope this number comes down as the models improve.

Some time has passed since I started this blog (~2 months ago), and more versions have come out with better ethical safeguards attached to them. I am not saying the concerns have been eliminated, but improvements have been made. When we discuss AI and the ethics behind the technology, several topics come up. We talked about copyright, but there are also bias, lack of trust, privacy, transparency, and discrimination, to name a few. A major concern is that there is no comprehensive federal regulation or legislation governing the development of AI in the United States. There have been proposed federal laws as well as bills introduced in state legislatures. Another issue that arises when trying to create these laws is that the definition of AI is a bit ambiguous: the state bills that have passed seem to use different definitions. With all this said, it would seem that the companies and the government have a way to go before these systems are completely ethical, or even at an acceptable level. Yet the companies continue to roll out new versions as well as new platforms… I will continue to update this blog as more info comes out.

Links with more info:

9 Problems With Generative AI, In One Chart
In the rapidly evolving landscape of artificial intelligence, generative AI tools are demonstrating incredible potential. However, their potential for harm is also becoming more and more apparent…

https://www.whitecase.com/insight-our-thinking/ai-watch-global-regulatory-tracker-united-states#:~:text=Laws%2FRegulations%20directly%20regulating%20AI,AI%20albeit%20with%20limited%20application.

6 Critical – And Urgent – Ethics Issues With AI
AI has lost no time in unfolding its immense potential.
Author: Dillon Thomas
Powered by The Information Lab