GitHub’s Commercial AI Tool Copilot Facing Criticism From Open-Source Community For Blind Copying Of Blocks Of Code

I feel like someone doesn't understand how machine learning works. You have to feed a model lots and lots of training data before it can figure out solutions to problems it's never seen before. When you ask it to generate an answer to something, it draws on all that past data to generate a new answer.

Sometimes, there's not enough training data to answer a certain question correctly. Other times, even with lots of data, it's entirely possible for all of its training data to point to the same solution.

But in every case, it's still synthesizing code. It just happens that, in a small selection of cases, the code it spits out looks a lot like a copy-paste, because that's what the AI thinks is the best answer based on its training data.
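To make that concrete, here's a toy sketch in Python. To be clear, this is not how Copilot works (that's a large neural network); it's just a tiny next-token counter I'm assuming for illustration, and the names `train` and `generate` are made up. The point is only that the same generation procedure gives back a training sample verbatim when every sample agrees, and stitches together something new when they don't.

```python
from collections import Counter, defaultdict

def train(samples):
    """Count which token follows each token across all training samples."""
    follows = defaultdict(Counter)
    for sample in samples:
        tokens = ["<s>"] + sample.split() + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            follows[prev][nxt] += 1
    return follows

def generate(follows, max_tokens=20):
    """Greedily emit the most frequent next token until the end marker."""
    out, token = [], "<s>"
    for _ in range(max_tokens):
        if token not in follows:
            break
        token = follows[token].most_common(1)[0][0]
        if token == "</s>":
            break
        out.append(token)
    return " ".join(out)

# Case 1: every training sample points to the same solution.
uniform = ['print ( "Hello world!" )'] * 3
print(generate(train(uniform)))   # matches the training sample token for token

# Case 2: the samples disagree, so the output is assembled from pieces.
varied = ["x = a + b", "x = d + b", "y = a * c", "z = a * c"]
novel = generate(train(varied))
print(novel, "| seen in training:", novel in varied)
```

In the first case the output is indistinguishable from a copy-paste, even though the model only ever stored counts. In the second, the line it produces doesn't appear anywhere in the training samples.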

So I wouldn't say it's copying code in any capacity. Unless you count typing out print("Hello world!") as copying from Wikipedia?

/r/coding Thread Parent Link - theinsaneapp.com