<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <title>Ai-Api-Directory on KnightLi Blog</title>
        <link>https://knightli.com/en/tags/ai-api-directory/</link>
        <description>Recent content in Ai-Api-Directory on KnightLi Blog</description>
        <generator>Hugo -- gohugo.io</generator>
        <language>en</language>
        <lastBuildDate>Thu, 12 Feb 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://knightli.com/en/tags/ai-api-directory/index.xml" rel="self" type="application/rss+xml" /><item>
        <title>LLM API Landscape (Free and Cost-Effective Options)</title>
        <link>https://knightli.com/en/2026/02/12/llm-api-landscape-free-and-cost-effective/</link>
        <pubDate>Thu, 12 Feb 2026 00:00:00 +0000</pubDate>
        
        <guid>https://knightli.com/en/2026/02/12/llm-api-landscape-free-and-cost-effective/</guid>
        <description>&lt;h2 id=&#34;google-gemini-api-best-free-tier&#34;&gt;Google Gemini API (Best Free Tier)
&lt;/h2&gt;&lt;p&gt;To promote the Gemini lineup, Google currently provides one of the most generous free quotas.
Pricing/details: &lt;a class=&#34;link&#34; href=&#34;https://ai.google.dev/gemini-api/docs/pricing?hl=zh-cn&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;https://ai.google.dev/gemini-api/docs/pricing?hl=zh-cn&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Models: Gemini 3 Flash Preview, Gemini 2.5 Pro (as of 2026-02-12).
In general, the newest top-end Pro model tends to have tighter free-tier limits, while many other models can still be used for free.&lt;/p&gt;
&lt;p&gt;Pros:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Even top-tier models may include a free quota.&lt;/li&gt;
&lt;li&gt;Very large context window (1M+ tokens).&lt;/li&gt;
&lt;li&gt;Strong multimodal support (image/video input).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Cons:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Data privacy: free-tier inputs may be used by Google to improve models (use with caution in production).&lt;/li&gt;
&lt;li&gt;Regional restrictions: access is strictly limited by location; requests from unsupported regions may fail with &lt;code&gt;403&lt;/code&gt; or &lt;code&gt;User Location Not Supported&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;groq-speed-king&#34;&gt;Groq (Speed King)
&lt;/h2&gt;&lt;p&gt;Groq runs on its in-house LPU (Language Processing Unit) hardware, which delivers extremely fast inference.
Pricing/details: &lt;a class=&#34;link&#34; href=&#34;https://groq.com/pricing&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;https://groq.com/pricing&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Models: GPT OSS / Kimi K2 / Llama 3 and 4 / Qwen3.
Quota: no free tier, but prices are relatively low.&lt;/p&gt;
&lt;p&gt;Pros:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Very low latency; time to first token (TTFT) is often under 200 ms.&lt;/li&gt;
&lt;li&gt;Great for real-time chat and voice assistants.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Cons:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The model catalog is mostly open-source; GPT-4 and Claude are not hosted directly.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;siliconcloud-strong-domestic-option&#34;&gt;SiliconCloud (Strong Domestic Option)
&lt;/h2&gt;&lt;p&gt;SiliconCloud is a fast-growing China-based inference platform that aggregates many high-quality Chinese open-source models.
Pricing/details: &lt;a class=&#34;link&#34; href=&#34;https://siliconflow.cn/pricing&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;https://siliconflow.cn/pricing&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Models: Qwen 2.5 (7B/14B/72B), DeepSeek-V2, Yi-1.5, Kimi K2.
Quota: some models (for example, Qwen 7B and GLM-4-9B) can currently be called for free with no expiry.&lt;/p&gt;
&lt;p&gt;Pros:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Fast connectivity from within China.&lt;/li&gt;
&lt;li&gt;New Chinese open-source models usually become available quickly.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Cons:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Free access is mainly limited to smaller models.&lt;/li&gt;
&lt;li&gt;Larger models (such as Qwen 2.5 72B or DeepSeek 236B) are usually paid.&lt;/li&gt;
&lt;/ul&gt;
</description>
        </item>
        
    </channel>
</rss>
