The BYOK Math Every Attorney, CPA, and Consultant Should Do in 2026

By Jameson Daines · May 12, 2026 · 8 min read

Two things happened in the last few months that changed how I think about AI tool spending. First, Cursor quietly killed BYOK support in late 2025. If you were using your own Anthropic or OpenAI key inside Cursor, that stopped working. The company needed subscription revenue, not API pass-through, and BYOK users were their most active users contributing zero dollars to the bottom line. So the feature went away.

Second, Google pulled Gemini Pro models off its free API tier on April 1, 2026. Gemini 3.1 Pro, 3 Pro, and 2.5 Pro all became paid-only. Flash sticks around with reduced quotas, but the era of using a flagship Pro model for free is over. Google also enforced mandatory monthly spending caps and rolled out prepaid billing for new accounts starting March 23.

These aren't isolated events. They're part of the same story: AI companies that started with open, flexible models are tightening up as they figure out their actual business. And for professionals who use AI for client work, that story has two sets of consequences: cost, and compliance.

So let me do both kinds of math. Most professionals I talk to have done neither, and the numbers on each are more interesting than you'd expect.

The compliance argument for BYOK that most articles skip

The typical BYOK conversation is purely about cost. I want to start somewhere different: for attorneys, CPAs, and consultants, BYOK isn't just a cheaper option. It's the architecture that keeps client information out of a third party's hands.

When you use a typical AI subscription tool, your prompts, which often contain client names, matter details, financial data, and privileged analysis, pass through that tool's infrastructure before reaching the underlying model. You're not just paying a markup on tokens. You're adding a data processor to the chain. That processor has its own privacy policy, its own data retention practices, and its own risk surface.

For attorneys, ABA Formal Opinion 512 (2024) on competent and ethical AI use requires attorneys to understand what cloud-based AI services do with client data and to take reasonable measures to protect confidentiality. The decision in United States v. Heppner (Judge Rakoff, S.D.N.Y., Feb. 17, 2026) added a sharper point: the court found that transmitting privileged communications through a third-party AI intermediary could constitute a voluntary disclosure sufficient to waive privilege. You don't have to intend to waive privilege. You just have to route client data through an intermediary you haven't adequately evaluated.

For CPAs and EAs, IRC §7216 makes it a criminal offense to knowingly or recklessly disclose or use tax return information for purposes other than tax preparation. The civil counterpart is IRC §6713, which carries a $250 penalty per disclosure. The FTC Safeguards Rule adds data security requirements for tax preparers. Using an AI tool that routes client tax data through a vendor's cloud infrastructure is a meaningful compliance question, not a hypothetical one.

With BYOK, your API key connects directly to Anthropic or OpenAI. The tool vendor, in this case Advisor Prep Hero, is not in the request path. Client data doesn't touch Advisor Prep Hero's servers. The data path is: your machine, your API key, the model provider's infrastructure. That's a chain you can actually describe to a client, a bar counsel, or an IRS auditor.

For a solo attorney, the question isn't just whether BYOK saves money. It's whether the alternative creates a disclosure you can't walk back. Those are different risk profiles.

What AI tools actually cost when you count everything

Now the cost argument, because it's also real.

The trap most professionals fall into is treating AI subscriptions like utilities: you pay the bill each month without thinking much about it. $20 here for Claude Pro, maybe $20 for ChatGPT Plus, a few dollars in API credits for something else. It adds up to $50-80/month and feels fine, vaguely.

Here's what API pricing actually looks like as of May 2026, per PEC Collective's pricing tracker:

Claude Sonnet 4.6: $3.00 per million input tokens, $15.00 per million output tokens
Claude Haiku 4.5: $1.00 per million input tokens, $5.00 per million output tokens
GPT-4.1 Mini: $0.40 per million input tokens, $1.60 per million output tokens
Gemini 3.1 Pro: $2.00 per million input tokens, $12.00 per million output tokens
GPT-4.1 Nano: $0.10 per million input tokens, $0.40 per million output tokens

A typical professional-level interaction, say drafting a client memo, working through a tax position analysis, or synthesizing deposition notes, runs maybe 2,000-5,000 input tokens and generates 800-2,000 output tokens. At Sonnet 4.6 rates, that's roughly $0.006 to $0.022 per session. Less than a cent or two.

If you're doing 20 meaningful AI work sessions a day, that's maybe $0.40/day. Under $15/month at Claude Sonnet rates. And Sonnet 4.6 is the serious, capable model, not a cheap alternative.

Compare that to a $20/month Claude Pro subscription. If you're doing 20 solid sessions a day, you're paying at parity or a slight premium for the subscription. Fair enough, subscriptions have convenience value. But if you're a lighter user doing 5-10 meaningful sessions a day, your real API cost might be $2-5/month. You're paying $20 for the subscription anyway. That's a 4-10x markup for the convenience of not managing an API key.

The token markup that most AI tools bake in is exactly the kind of thing that makes you feel vaguely ripped off when you do the math. The feeling is correct. The markup is real.

Why Cursor's BYOK ban matters beyond coding tools

Cursor's decision to kill BYOK illustrates exactly how the incentive math works for AI tool companies.

BYOK users were Cursor's most active power users. They were doing the most with the product. They were also generating zero subscription revenue. That's a brutal unit economics problem. The company had two options: find a way to monetize BYOK users, or remove the feature and force migration to paid plans. They chose the latter.

The decision makes complete business sense for Cursor. But it reveals something important about any AI tool that offers BYOK as a feature: BYOK is always a candidate for deprecation the moment the company needs revenue badly enough. It's a distribution feature, not a commitment.

The tools where BYOK is genuinely safe long-term are the ones where the business model doesn't depend on monetizing model access at all. A one-time-purchase desktop app supports BYOK because it doesn't sell model access. That's how Advisor Prep Hero works. You pay once for the software, or annually for the Professional plan. You bring your own key. I have no financial reason to ever take that away from you, because I'm not in the business of reselling tokens.

That's a structural difference, not a feature toggle.

The three types of AI spend professionals should separate

I've found it useful to think about AI costs in three distinct buckets:

1. Foundation model access

This is what you pay to use Claude, GPT-4, Gemini, or any other underlying model. The cheapest path is direct API access at cost. The most expensive path is paying a subscription markup to an intermediary. For professionals with confidentiality obligations, the "cheapest" path is also the most defensible: your API key connects directly to the model provider, with no intermediary processing your client data.

2. Workflow tooling

This is separate from model access. It's the structured interface, the profession-specific templates, the file output, the organization layer built around how attorneys, CPAs, and consultants actually work. This is where it makes sense to pay a separate product, if that product provides genuine workflow value and is architecturally designed to keep client data off third-party servers.

The mistake is conflating these two buckets. A $20/month subscription usually bundles both model access and workflow tooling together, routed through a cloud backend. When you separate them, you can evaluate each honestly: is the workflow layer worth paying for, and is the data architecture defensible?

3. Output storage and organization

Where do the things you make with AI actually live? In a chat history that disappears after 30 days? In a proprietary database the vendor controls? Or in Markdown files on your hard drive that you own outright?

For professionals, this isn't just an organizational question. Client work product that lives in a vendor's cloud is client work product that's subject to that vendor's data breach risk, subpoena risk, and business continuity risk. If the vendor gets acquired, pivots, or shuts down, your client records may be inaccessible or exposed. That's a malpractice risk vector that most professionals haven't thought through carefully.

What Google's Gemini shift actually means

The Gemini Pro tier going paid is worth a quick note even if you're not primarily a Gemini user, because it affects the overall competitive dynamic.

For the past year, Gemini Pro was genuinely free at the API level for developers with modest usage. That made it an attractive option for professionals who wanted a capable model without paying for API access. That option is now gone. Pro models require payment, either through Google AI Pro ($19.99/month) or direct API charges at $2.00/$12.00 per million tokens for Gemini 3.1 Pro.

What this means practically: the "try before you buy" window for flagship AI models is shrinking across the board. OpenAI, Anthropic, and Google are all moving toward paid access for their best models. That makes the BYOK model smarter as a long-term strategy. You're going to be paying for model access either way. The question is whether you're paying at cost through an API key, or paying a marked-up rate through a subscription wrapper, while also routing your client data through that wrapper's infrastructure.

The actual BYOK ROI math for a heavy professional user

Let me make this concrete. Say you're an attorney doing serious AI work every day: drafting client memos, analyzing case documents, preparing deposition outlines, writing correspondence. Heavy use.

BYOK vs. managed AI cost comparisons consistently show that heavy users save the most from going direct-to-API. Here's a rough scenario:

Assume you're using Claude Sonnet 4.6 as your primary model and running about 40 meaningful sessions per day. Each session averages 3,000 input tokens and 1,200 output tokens. Monthly cost:

Input: 40 sessions × 3,000 tokens × 30 days = 3.6 million tokens/month × $3.00/million = $10.80
Output: 40 × 1,200 × 30 = 1.44 million tokens × $15.00/million = $21.60
Total: $32.40/month for extremely heavy Sonnet 4.6 usage

That's a heavy user paying about $32/month at direct API rates. Claude Pro is $20/month but comes with usage limits that a genuinely heavy user will hit. Claude Max 5x is $100/month for 5x the limits. The math at heavy usage starts to favor direct API access, even before you factor in the confidentiality architecture benefit.

Switch some tasks to Haiku 4.5 ($1.00/$5.00) or GPT-4.1 Mini ($0.40/$1.60) for lighter work, and your monthly API bill drops further. That routing flexibility is the real cost advantage of going direct-to-API. You pay the right price for the right model for each task, instead of paying a flat rate for everything, through infrastructure you didn't choose as a data processor for your clients.

What to actually do about this

I'm not arguing everyone should switch to raw API keys and code their own interfaces. Most professionals shouldn't do that. The friction is real.

What I am arguing is that you should do this math once, honestly, based on your actual usage. And then ask three questions that go beyond cost:

What am I actually paying for model access per month, including any markup baked into subscriptions?
Where does my client data go when I use AI, and have I actually read the data handling terms of every tool in that chain?
Do I own my output, or am I renting access to it through a vendor whose business model I'm funding with a monthly markup on inference?

The market's going to keep shifting. Cursor's not the last tool to change its terms around BYOK. Google's not done tightening its free tiers. The professionals who think clearly about this now, separate model access from workflow value, understand their data architecture, and own their output, will be in a much better position as both prices and regulations evolve.

I built Advisor Prep Hero on exactly these principles: you bring your own key for whatever models you want (Claude, OpenAI, Gemini), and everything you produce lives as Markdown files on your device. No subscription markup on tokens. No cloud database holding your client work. No risk of a BYOK deprecation announcement in your inbox. The Professional plan is $149/year and includes one profession pack. The Practice plan is $499/yr for all four packs and up to five seats.

The math on that, for a professional with confidentiality obligations, is pretty simple.

Download Advisor Prep Hero free →