Previously, I shared a few articles on deploying open source large language models locally and their use cases, such as adding knowledge bases to local models and making API calls.
During this process, some readers gave feedback that they didn’t know how to deploy the models, or that their computer configurations were not good enough, and asked whether there were other ways to use these open source large language models without local deployment. Today, I will share a free and barrier-free method to use various mainstream open source models and their APIs. It lets you invoke these models in various third-party applications, such as translation plugins, Dify workflows, and conversation apps, without deploying anything locally, and can save you money in those scenarios.
Introducing Groq
The platform we will use today is called Groq.
Groq is an AI hardware cloud platform, similar to cloud computing platforms such as Alibaba Cloud in China or Microsoft Azure, providing cloud servers and cloud computing services to enterprises. Groq’s biggest selling point is extremely fast inference at very low operating cost: the company claims inference speeds more than ten times faster than Nvidia GPUs at less than one-tenth of the running cost.
Impressive Inference Speed
Let’s directly experience Groq’s inference speed. First, I’ll find an article, select all and copy it to Groq, and have it translate into English.
Using the Llama3 70B model as a demo, let’s look at its output speed. It takes only about 3 to 4 seconds to translate the entire article into English, which is really fast.
The overall completion quality and speed are very good. In the upper right corner, you can see the consumption and speed: the input is 1,707 tokens and the output is 1,040 tokens, completed in a total of 3.93 seconds. In other words, it took less than 4 seconds to translate the whole article.
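Those on-screen numbers translate into a tokens-per-second figure with simple arithmetic. A quick sketch using the values from this demo:

```python
def tokens_per_second(output_tokens: int, seconds: float) -> float:
    """Throughput as output tokens divided by wall-clock time."""
    return output_tokens / seconds

# Numbers from the demo above: 1,040 output tokens in 3.93 seconds
rate = tokens_per_second(1040, 3.93)
print(f"{rate:.0f} tokens/s")  # roughly 265 tokens/s
```

For comparison, local deployments of a 70B model on consumer hardware typically manage far fewer tokens per second, which is why this speed stands out.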
Article Generation Example
Let’s demonstrate another case: asking it to write an article of about 1,000 words on the topic “How AI Can Make Half a Year’s Effort Surpass Ten Years of Mediocrity”, and feel its speed again. This time we switch to a different model, Llama3 8B with 8 billion parameters, which is even faster. Let’s try it: the article is done almost in an instant.
Free Access to Open Source Models
On the Groq platform, we can use these open source large models for free. Beyond direct online use, it also provides an API. Click on Groq Cloud in the bottom menu to find a Playground and an API call interface; let’s take a look at both.
First is the Playground, which is similar to the Playgrounds of other large model platforms. On the right side you can see the supported models: Google’s, Meta’s, and Mistral’s are all there. You can also see that it supports free calls to OpenAI’s Whisper speech recognition model for transcribing audio.
Invoking Open Source Model APIs in Third-Party Applications
Next, let’s look at how to invoke the APIs of these open source large models in third-party applications, whether in browser plugins, desktop applications, or third-party cloud service platforms, such as calling open source model APIs in Dify’s workflows.
In the left menu there is an API Keys entry, which means we can create API keys for free here and use them in third-party applications. Moreover, Groq’s API interface is compatible with OpenAI’s API interface. This means that any third-party application that supports the OpenAI API format can also work with Groq’s API.
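To make that compatibility concrete, here is a minimal sketch of a chat request to Groq’s OpenAI-compatible endpoint using only the Python standard library. The endpoint URL and model name match Groq’s documentation at the time of writing, but treat them as assumptions that may change; `GROQ_API_KEY` is a placeholder for your own key.

```python
import json
import os
import urllib.request

# Groq's OpenAI-compatible base URL (assumption: current as of writing)
GROQ_BASE = "https://api.groq.com/openai/v1"

def build_chat_request(api_key: str, model: str, user_message: str) -> urllib.request.Request:
    """Build an HTTP request in the same shape an OpenAI-compatible client sends."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        f"{GROQ_BASE}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Only send the request if a real key is configured in the environment
if __name__ == "__main__" and os.environ.get("GROQ_API_KEY"):
    req = build_chat_request(
        os.environ["GROQ_API_KEY"], "llama3-70b-8192", "Say hello in one sentence."
    )
    with urllib.request.urlopen(req) as resp:
        body = json.loads(resp.read())
        print(body["choices"][0]["message"]["content"])
```

Because the request shape is identical to OpenAI’s, any tool that lets you override the API base URL can point at Groq by swapping the endpoint and key.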
Here, we can create an API key. Be sure to copy and save the key as soon as it is created: once you close the creation page, there is no way to view or copy it again.
Next, let’s call this API key in Dify for a simple test.
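Under the hood, a compatible app like Dify simply reads Groq’s reply in the OpenAI response shape. A minimal sketch of that parsing step (the sample JSON below is illustrative, not a captured Groq response):

```python
import json

# Illustrative response in the OpenAI-compatible shape (values are made up)
sample = json.loads("""
{
  "choices": [{"message": {"role": "assistant", "content": "Hello!"}}],
  "usage": {"prompt_tokens": 12, "completion_tokens": 3}
}
""")

def extract_reply(response: dict) -> str:
    """Pull the assistant's text out of an OpenAI-compatible chat response."""
    return response["choices"][0]["message"]["content"]

print(extract_reply(sample))  # Hello!
```

Since the response format is the same, third-party apps need no Groq-specific code; they just swap the base URL and API key.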
Instant reply, it’s really fast!
Usage Limits
Currently, the Groq platform is free of charge, though with some rate limits. The API key we created here has a usage limit, which you can check under Settings → Limits.
For individual users, these limits are generous enough that you are unlikely to exhaust them.
Conclusion
This is today’s introduction to the Groq platform, which provides us with free online open source large models and free API interfaces for us to call in third-party applications. With Groq, you can easily access powerful AI capabilities without the complexity of local model deployment. Give it a try and see how it can enhance your projects and workflows!