Tooploox CS and AI news 41

Date: May 2, 2024 | Author: Konrad Budek

April brought new models, ranging from Tiny Language Models to large ones, with even larger ones on the way. Billie Eilish and Nicki Minaj also joined forces in an open letter on AI, and OpenAI pushed into new frontiers.

There was also a breakthrough from Google, with the ability to extend the context window to infinity – at least according to the paper. 

04.02.2024 Billie Eilish and Nicki Minaj demand protection against AI

A group of over 200 musicians signed an open letter calling for the establishment of new frameworks and boundaries that will protect creative workers and artists from predatory AI usage. The letter underlines the necessity of preventing AI tools from replacing or undermining human songwriters. 

More information can be found in The Guardian.  

04.03.2024 Users can edit DALL-E images using ChatGPT after a recent update

Integrating DALL-E with ChatGPT makes the AI user experience smoother and more convenient. With the help of ChatGPT, users no longer have to craft a perfect prompt on their own, as the AI supports them in writing the query. According to The Verge, users can now enter a loose prompt, which the GPT model behind ChatGPT pre-processes into more detailed instructions for DALL-E.

More can be found in The Verge's story.

04.06.2024 OpenAI transcribed YouTube videos to train ChatGPT

According to information gathered by The Wall Street Journal and The New York Times, OpenAI has trained a special model called Whisper to transcribe millions of YouTube videos in order to collect training data for its GPT-4 model. This move is considered “shady” at the very least under current copyright laws.

More can be found in The Verge.

04.08.2024 Spotify lets users build playlists with AI prompts

Spotify Premium users can now build playlists from a text prompt alone. The user enters a message, and the system prepares a curated list based on it. A prompt might be, for example, “a good list for reading fantasy novels on autumn nights at home” or “some good hip-hop stuff from the 90s,” and the system returns 30 songs that may fit that vision.

More can be found in The Verge.

04.09.2024 eBay adds AI-powered “Shop the Look” feature

This feature uses interactive hotspots to show similar items and outfit ideas, incorporating both pre-owned and luxury items that align with the user’s style. It is available to users who have viewed at least 10 fashion items in the past 180 days and is displayed on eBay’s homepage and fashion landing page. It was recently launched in eBay’s iOS app.

Tooploox proudly contributed to the feature’s development. More information can be found on TechCrunch.

04.09.2024 Gemini 1.5 Pro can process audio

The model can now process audio files, such as songs or spoken recordings, directly, without transcribing them first. Gemini Pro is a mid-size model from Google, with Gemini Ultra being the company's largest. Yet, according to a statement from Google, Gemini Pro performs better and delivers more accurate results than the Ultra model.

More about the audio capabilities of the Gemini Pro model can be found in this story from The Verge.

04.12.2024 Google’s new technique gives LLMs infinite context 

A paper recently published by Google introduces Infini-attention, a technique that allows Language Models to extend their context window to, in principle, unbounded length while keeping computation and memory costs bounded.

The research paper includes experiments with context windows of up to a million tokens, yet, according to the research team, the approach can be extended to infinity. This is made possible by a “compressive memory” component that stores context beyond the regular window in a compressed, fixed-size form instead of keeping every token.
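To make the idea of a fixed-size memory more concrete, here is a minimal sketch in plain Python. It loosely follows the linear-attention-style memory described in the paper: an associative matrix M and a normalization vector z are updated segment by segment, and a query reads from them. The class and method names (`CompressiveMemory`, `update`, `retrieve`, `size`) are illustrative assumptions, not the authors' code, and the sketch leaves out the local attention and the learned gating that the full technique combines with the memory read-out.

```python
import math


def elu_plus_one(x):
    """sigma(x) = ELU(x) + 1, which keeps every feature strictly positive."""
    return x + 1.0 if x > 0 else math.exp(x)


class CompressiveMemory:
    """Illustrative fixed-size memory (hypothetical class, not the paper's code)."""

    def __init__(self, d_key, d_value):
        # M has shape d_key x d_value and z has length d_key.
        # Neither grows, no matter how many tokens are written.
        self.M = [[0.0] * d_value for _ in range(d_key)]
        self.z = [0.0] * d_key

    def update(self, keys, values):
        """Fold one segment of (key, value) pairs into the memory."""
        for k, v in zip(keys, values):
            for i, x in enumerate(k):
                s = elu_plus_one(x)
                self.z[i] += s                 # z <- z + sigma(k)
                for j, vj in enumerate(v):
                    self.M[i][j] += s * vj     # M <- M + sigma(k)^T v

    def retrieve(self, query):
        """Read from memory: sigma(q) M / (sigma(q) . z)."""
        d_value = len(self.M[0])
        sq = [elu_plus_one(x) for x in query]
        denom = sum(s * zi for s, zi in zip(sq, self.z))
        if denom == 0.0:  # nothing has been stored yet
            return [0.0] * d_value
        return [
            sum(sq[i] * self.M[i][j] for i in range(len(sq))) / denom
            for j in range(d_value)
        ]

    def size(self):
        """Number of floats held by the memory."""
        return len(self.M) * len(self.M[0]) + len(self.z)


if __name__ == "__main__":
    memory = CompressiveMemory(d_key=2, d_value=2)
    # Stream many segments through the memory; its size stays the same.
    for _ in range(1000):
        memory.update([[0.1, 0.2], [0.3, -0.4]], [[1.0, 1.0], [0.0, 2.0]])
    print(memory.size(), memory.retrieve([0.5, -0.5]))
```

However many segments are written, the memory holds the same number of values, which is why the cost stays flat as the context grows. The trade-off is that older context is blended together, not stored token by token.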

More can be found in the research paper.

04.14.2024 OpenAI opens office in Japan

The company behind ChatGPT and DALL-E has launched its first Asian office in Tokyo, Japan. Tokyo was chosen due to its long tradition of people and technology working together. One of the company's first moves there was to launch a GPT language model optimized for the Japanese language to help local businesses build AI-powered competitiveness.

More can be found in the company’s announcement.

04.15.2024 Adobe Premiere Pro is getting generative AI video tools

Adobe is developing a generative AI video model for its Firefly family, intended for use in Premiere Pro to enhance video editing capabilities. The new Firefly tools will enable users to generate video, add or remove objects using text prompts, and extend video clip lengths, similar to Photoshop’s Generative Fill feature. No specific release date has been announced for these new tools. Adobe has only indicated that they will roll out some time “this year.”

More can be found in the company’s press release.

04.15.2024 Hugging Face launches the Idefics2 vision-language model

The 8-billion-parameter model released by Hugging Face can take an arbitrary sequence of texts and images as input and produce text in response. This makes it useful for entity recognition and information extraction. The model can also handle complicated math equations from image input.

As with all models from Hugging Face, Idefics2 is open source and can be modified and used for commercial and non-commercial purposes at will. 

More information about the model itself can be found on the Hugging Face Blog

04.16.2024 Stanford: AI surpasses humans on many fronts, but at what cost… 

According to a recent report by Stanford University, AI-based systems outperform humans in tasks like image classification, visual reasoning, and English understanding. On the other hand, mathematics and common sense reasoning are still areas where humans lead the way. 

The full report can be found through this link

04.18.2024 Meta releases Llama 3

The tech giant behind Facebook, Instagram, and Threads has released a new version of its open-source language model, Llama. The model comes in two versions: a smaller one with 8 billion parameters and a larger one with 70 billion. The company states that a much larger 400-billion-parameter model is also in the works, yet there is no information about an expected release date. According to the company, the model currently outperforms models from Google (Gemini Pro 1.5), Mistral (Mistral 7B), and Anthropic (Claude 3 Sonnet).

More can be found in the announcement from Meta

04.22.2024 Stanford: More than half of US citizens use GenAI 

According to the report, 82% of respondents view AI as a tool that enhances creativity and simplifies lives. Also, 41% of those who use gen AI engage with it daily. Usage varies by purpose, with 81% utilizing it for personal projects, 30% for work, and 17% for school.

More on the matter can be found in the full study

04.23.2024 Microsoft releases the Phi-3 Tiny Language Model

A relatively small (3.8-billion-parameter) model, much smaller than the majority of popular models used today, has shown very satisfying performance levels. According to Sebastien Bubeck, Vice President of Microsoft GenAI, the system can deliver results comparable to GPT-3.5 while being far less expensive and resource-consuming.

More information can be found in the company press release.
