
This week in AI: Let’s not forget the humble data annotator

Keeping up with a rapidly evolving industry like artificial intelligence is a daunting task. So until AI can do that for you, here’s a handy roundup of the latest stories in machine learning, as well as notable studies and experiments we didn’t cover ourselves.

This week in AI, I want to shine a spotlight on labeling and annotation startups, such as Scale AI, which is reportedly in talks to raise new funding at a $13 billion valuation. Labeling and annotation platforms may not attract as much attention as flashy new generative AI models like OpenAI’s Sora, but they are essential. Without them, modern AI models would arguably not exist.

The data on which many models are trained must be labeled. Why? Labels help the model understand and interpret the data during training. For example, labels for training an image recognition model might take the form of markers drawn around objects (“bounding boxes”) or captions describing each person, place, or object depicted in the image.
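To make that concrete, here is a minimal sketch of what a single label for an image recognition model might look like: a couple of bounding boxes plus a caption. The field names are illustrative (loosely COCO-style), not any particular platform’s schema.

```python
# A hedged sketch of one training label: bounding boxes plus a caption.
# Field names are illustrative, not a specific annotation platform's format.

annotation = {
    "image_id": "street_00042.jpg",
    "objects": [
        {
            "label": "bicycle",
            "bbox": [34, 120, 180, 260],   # [x_min, y_min, x_max, y_max] in pixels
        },
        {
            "label": "person",
            "bbox": [150, 60, 230, 300],
        },
    ],
    "caption": "A person walking a bicycle across a crosswalk.",
}

# During training, the model sees the image alongside these labels and
# learns to associate pixel regions with the named categories.
print(len(annotation["objects"]), "labeled objects")
```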

The accuracy and quality of labels significantly affect the performance and reliability of the trained models. And annotation is a massive undertaking, with larger and more complex datasets requiring thousands to millions of labels.

So you might think that data annotators would be treated well, paid a living wage, and receive the same benefits that the engineers who build the models themselves enjoy. But the opposite is often true—a product of the cutthroat working conditions fostered by many annotation and labeling startups.

Companies like OpenAI, which has billions in the bank, rely on annotators in developing countries who are paid as little as a few dollars an hour. Some of these annotators are exposed to highly disturbing content, such as graphic imagery, yet are not given time off (since they are usually contractors) or access to mental health resources.

A great article in NY Mag in particular pulled back the curtain on Scale AI, which recruits annotators in countries as far away as Nairobi, Kenya. Some tasks on Scale AI require labelers to work eight hours straight with no breaks, and pay as little as $10. These workers are subject to the whims of the platform: annotators sometimes go long stretches without receiving any work, or are unceremoniously dropped from Scale AI, as happened recently to contractors in Thailand, Vietnam, Poland, and Pakistan.

Some annotation and labeling platforms claim to offer “fair trade” work; indeed, they have made it a core part of their branding. But as MIT Technology Review’s Kate Kaye points out, there are no regulations defining what ethical labeling work means, only weak industry standards and companies’ own definitions, which vary widely.

So what to do? Barring a huge technological breakthrough, the need to annotate and label AI training data is not going away. We can hope the platforms will regulate themselves, but the more realistic solution seems to be policymaking. That’s a tricky prospect in itself, but I think it’s our best shot at making things better. Or at least at starting to.

Here are some other noteworthy AI stories from the past few days:

    • OpenAI built a voice cloner: OpenAI is previewing Voice Engine, a new artificial intelligence-driven tool it has developed that enables users to clone a voice from a 15-second recording of someone speaking. But the company has chosen not to release it widely (yet), citing the risk of misuse and abuse.
    • Amazon doubles down on Anthropic: Amazon invested another $2.75 billion in growing AI power Anthropic, following through on the option it left open last September.
    • Google.org launches accelerator: Google.org, the philanthropic arm of Google, is launching a new six-month, $20 million program to help fund nonprofit organizations developing technology that leverages generative artificial intelligence.
    • New model architecture: Artificial intelligence startup AI21 Labs has released Jamba, a generative artificial intelligence model that uses a novel model architecture – the state space model (SSM) – to improve efficiency.
    • Databricks launches DBRX: In other model news, Databricks this week released DBRX, a generative AI model similar to OpenAI’s GPT series and Google’s Gemini. The company claims it achieves state-of-the-art results on many popular AI benchmarks, including several measuring reasoning.
    • Uber Eats and UK AI regulation: Natasha writes about how an Uber Eats courier’s fight against AI bias shows that justice under the UK’s AI regulations is hard won.
    • EU election security guide: The European Union on Tuesday released draft election security guidelines aimed at the roughly two dozen platforms regulated under the Digital Services Act, including guidance on preventing content recommendation algorithms from spreading AI-based disinformation (aka political deepfakes).
    • Grok upgrade: X’s Grok chatbot will soon get an upgraded underlying model, Grok-1.5, and all premium subscribers on X will gain access to Grok. (Grok was previously only available to X Premium+ customers.)
    • Adobe extends Firefly: This week, Adobe launched Firefly Services, a set of more than 20 new generative and creative APIs, tools, and services. It’s also launching Custom Models, which lets businesses fine-tune Firefly models on their own assets, part of Adobe’s new GenStudio suite.

More machine learning

How’s the weather? AI is increasingly able to tell you. I noted a few months ago that there were efforts underway on hourly, weekly, and century-scale forecasting, but like all things AI, the field is evolving rapidly. The team behind MetNet-3 and GraphCast has published a paper describing a new system called SEEDS, for Scalable Ensemble Envelope Diffusion Sampler.

Animation showing how a larger number of forecasts produces a smoother distribution of predicted weather.

SEEDS uses diffusion to generate an “ensemble” of plausible weather outcomes for an area based on the inputs (radar readings or orbital imagery, perhaps) much faster than physics-based models can. With a larger ensemble count, the team can cover more edge cases (such as an event that occurs in only 1 of 100 possible scenarios) and be more confident about the more likely ones.
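As a toy illustration of why ensemble size matters, here is a minimal sketch of the sampling idea: a generative sampler (here just a Gaussian-family stand-in, not the actual SEEDS diffusion model) produces many candidate forecasts, and the probability of a rare event is estimated by counting how many ensemble members contain it.

```python
import numpy as np

# Toy sketch of the ensemble idea, NOT the real SEEDS model:
# a cheap random sampler stands in for the diffusion model, and we estimate
# how likely a rare event is by counting ensemble members that contain it.

rng = np.random.default_rng(0)

def sample_ensemble(n_members: int) -> np.ndarray:
    """Stand-in for a diffusion-based sampler: returns n_members forecasts
    of, say, 24-hour rainfall in mm (drawn here from a skewed toy distribution)."""
    return rng.gamma(shape=2.0, scale=5.0, size=n_members)

for n in (10, 100, 1000):
    forecasts = sample_ensemble(n)
    # Probability of an "extreme" event: rainfall above 40 mm.
    p_extreme = (forecasts > 40.0).mean()
    print(f"{n:>5} members -> P(rain > 40 mm) ≈ {p_extreme:.3f}")
```

With only 10 members, a 1-in-100 event usually never shows up at all; with 1,000 members the estimate of its probability becomes far more stable, which is the practical payoff of cheap, fast ensemble generation.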

Fujitsu is also hoping to better understand the natural world by applying AI image-processing techniques to underwater imagery and lidar data collected by autonomous underwater vehicles. Improving the quality of the imagery should let other, less sophisticated processes (such as 3D conversion) work better on the target data.

Image Source: Fujitsu

The idea is to build a “digital twin” of the waters that can help simulate and predict new developments. We’re still a long way from that goal, but you have to start somewhere.

In the world of LLMs, researchers have found that the models simulate intelligence via a simpler method than expected: a linear function. Frankly, the math is beyond me (vector stuff in many dimensions), but this MIT article makes it pretty clear that the recall mechanism of these models is quite… basic.

“Although these models are very complex, nonlinear functions that are trained on large amounts of data and are difficult to understand, they sometimes have very simple mechanisms inside them. This is one example of that,” said co-lead author Evan Hernandez. If you’re more technically inclined, check out the paper here.
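For a rough sense of what “a linear function” means here, this is a toy sketch of the idea of decoding a relation with an affine map applied to a hidden representation. The matrices below are random placeholders, not the fitted weights from the paper, and the dimensions are made up for illustration.

```python
import numpy as np

# Toy illustration of "linear relation decoding": the claim is that for many
# relations (e.g. "X plays the instrument Y"), the mapping from a subject's
# hidden state to the object's representation is well approximated by an
# affine function. Everything below is a placeholder -- random matrices
# standing in for weights fitted to a real model.

rng = np.random.default_rng(1)
d = 64                                    # hidden-state dimensionality (toy)

subject_hidden = rng.normal(size=d)       # stand-in for the hidden state of, say, "Miles Davis"
W = rng.normal(size=(d, d)) / np.sqrt(d)  # relation weights (here: random)
b = rng.normal(size=d)                    # relation bias (here: random)

# Affine approximation of the relation: object ≈ W @ subject + b
object_estimate = W @ subject_hidden + b
print(object_estimate.shape)              # (64,) -- a vector you'd then decode to "trumpet"
```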

One way these models can fail is by not understanding context or feedback. Even a really capable LLM might not “get it” if you tell it your name is pronounced a certain way, because it doesn’t actually know or understand anything. In situations where that matters, like human-robot interaction, it could put people off if the robot gets it wrong.

Disney Research has been working on automated character interactions for a long time, and a paper on name pronunciation and reuse appeared just a little while ago. It may seem obvious, but extracting the phonemes when someone introduces themselves and encoding those, rather than just the written name, is a smart approach.
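Here is a hedged sketch of that idea, not the paper’s actual pipeline: keep the phoneme sequence heard at introduction time alongside the written spelling, so the pronunciation can be reused later (for example, by a TTS engine). The phoneme symbols and helper function below are illustrative placeholders.

```python
from dataclasses import dataclass

# Sketch of the idea: store the phonemes heard when a user introduces
# themselves, rather than relying on the written spelling alone.
# The phonemes and helper below are illustrative, not the paper's system.

@dataclass
class UserName:
    spelling: str          # e.g. from a sign-up form
    phonemes: list[str]    # e.g. ARPAbet-style symbols heard at introduction time

def greeting(name: UserName) -> str:
    # A real system would hand the phoneme string to the TTS engine;
    # here we just show that the stored pronunciation travels with the name.
    return f"Nice to meet you, {name.spelling} (/{' '.join(name.phonemes)}/)!"

# "Siobhan" is pronounced roughly "shiv-AWN" -- spelling alone would mislead a TTS.
print(greeting(UserName("Siobhan", ["SH", "IH", "V", "AO1", "N"])))
```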

Image Source: Disney Research

Finally, as AI and search increasingly overlap, it’s worth reassessing how these tools are used and whether this unholy union poses any new risks. Safiya Umoja Noble has been an important voice in AI and search ethics for many years, and her perspective is always enlightening. She gave a great interview with the UCLA news team about how her work has evolved and why we need to stay vigilant when it comes to bias and bad habits in search.

