Comparing AI & Humans in Decision Making
Artificial intelligence is all the rage right now. Everything suddenly seems to have a ChatGPT or AI component. One element being overlooked is how the underlying models fueling these AI features are trained. Most people in the technology sector have heard of the GIGO (Garbage In, Garbage Out) principle, and the reality is that machine learning models are only as smart as their training data. That data is typically assembled with descriptive labels (identifying factual features) or normative labels (evaluating whether a condition, such as a rule, has been violated). Does the type of training data used in an ML model influence a machine’s ability to replicate human decisions? That is what a team of researchers from MIT and the University of Toronto set out to investigate.
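To make the distinction concrete, here is a minimal sketch of the two labeling schemes. The item, its features, and the dress-code policy are all invented for illustration; they are not from the study. A descriptive labeler records factual features without seeing the policy, while a normative labeler judges the same item against the policy.

```python
# Invented example item: a descriptive labeler tags only factual features
# and never sees the policy.
descriptive_label = {
    "item": "photo_042",   # hypothetical identifier
    "shorts": True,
    "sleeveless": False,
}

# Invented dress-code policy: shorts are not allowed.
POLICY = {"shorts_allowed": False}

def normative_label(features, policy):
    """A normative labeler reads the policy and judges whether
    the item violates it."""
    return features["shorts"] and not policy["shorts_allowed"]

print(normative_label(descriptive_label, POLICY))  # True: a violation judgment
```

The key point is that the two schemes produce different kinds of ground truth from the same item: one records what is there, the other records a judgment.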
In their paper, recently published in the journal Science Advances, the research team describes how 3,373 human participants were asked to label four distinct data sets under three conditions. Each condition asked labelers to determine whether an object violated a rule (for example, clothing violating a dress code or a dog violating an aggressive-pet policy). Participants were randomly assigned to provide descriptive or normative labels. Descriptive labelers did not see the policy in question, while normative labelers had to read the policy and judge whether the object violated it. The researchers then trained machine learning models on each type of label so that machine and human judgments could be compared.
The researchers found that the machines did not accurately replicate the policy-violation decisions made by humans, and their analysis showed that the label type played a role in the discrepancy. Models trained on descriptive labels over-predicted rule violations, producing harsher judgments. Models trained on normative labels were also inaccurate, but to a lesser degree. Interestingly, when human labelers were told that their output would be used to form a judgment, machine and human decisions grew closer.
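The over-prediction effect can be sketched with toy numbers (invented here, not taken from the paper). If a model trained on descriptive labels effectively maps "feature present" straight to "violation," it misses cases where human judges excuse the feature in context:

```python
# Toy data: each item is (feature_present, human_judges_violation).
# The numbers are invented for illustration only.
items = [
    (True,  True),   # feature present, humans call it a violation
    (True,  False),  # feature present, but humans excuse it in context
    (True,  False),
    (False, False),
]

# A descriptive-label pipeline effectively predicts a violation
# whenever the feature is present.
descriptive_rate = sum(f for f, _ in items) / len(items)

# A normative-label pipeline learns from the contextual human judgment.
normative_rate = sum(j for _, j in items) / len(items)

print(descriptive_rate, normative_rate)  # 0.75 0.25
```

With these made-up numbers, the descriptive pipeline flags three of four items while humans flag only one, mirroring the harsher judgments the study reports.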
The findings still need to be replicated, but their implications are substantial. Most people working on AI or ML will concede that humans offer biased opinions; these data suggest that not only does the label type matter, but providing proper context to the labelers also requires consideration. As this study suggests, if the training data is garbage, the outcome will follow suit.
This was Article 212 from the Studio Quick Facts Series.
Trending AI/ML Article Identified & Digested via Granola by Ramsey Elbasheer; a Machine-Driven RSS Bot