Wow, this is almost as good as chatgpt-web [0], and it works offline and is free...

oezi · on Nov 30, 2023

Way cheaper? I thought that 1K Tokens (in+out) cost 0.04 USD in GPT-4 Turbo, which is roughly one larger chat response (2 screens). To reach parity with ChatGPT Plus pricing you need thus to use less than 500 such responses per month via API.

For GPT-4 the pricing is double that (0.09 USD per 1K). So only 200 larger interactions to reach 20 USD cost.

Or am I wrong?

anonzzzies · on Nov 30, 2023

It depends on your usage; for me the plus sub is much cheaper than if I use the api directly, but I use it a lot for everything I do.

sebmellen · on Nov 30, 2023

In my experience, each message with the 1106 preview model costs me about $0.006, which is acceptable. Most importantly, the API is higher availability (no "you have reached your message limit") and I feel more comfortable using proprietary data with it, as data sent through the API won't be used to train the model.

Now, if the chat gets very long or is heavy on high-token strings (especially code), those costs can balloon up to the 9-12 cent region. I think this is because chatgpt-web loads all the prior messages in the chat into the context window, so if you create a new chat for each question you can lower costs substantially. Most often I don't need much prior context in my chat questions anyway, as I use ChatGPT more like StackOverflow than a conversation buddy.

Also, it's a lot easier to run company subscriptions this way, as we don't have to provision a new card for each person to use the web version. I believe there is an Enterprise version of ChatGPT, but chatgpt-web is functionally equivalent and I'm sure it costs less.

hnuser123456 · on Nov 30, 2023

Source on the newer GPT-4 model being worse at coding?

MacsHeadroom · on Nov 30, 2023

Everyone on twitter. Like 1/4th of my timeline for the past week has been people complaining that turbo won't complete code and instead returns things like "fill out the rest of the function yourself" or "consult a programming specialist for help on completing this section."

sebmellen · on Nov 30, 2023

There are custom instructions that effectively get around this:

  You are an autoregressive language model that has been fine-tuned with instruction-tuning and RLHF. You carefully provide accurate, factual, thoughtful, nuanced answers, and are brilliant at reasoning. If you think there might not be a correct answer, you say so.

  Since you are autoregressive, each token you produce is another opportunity to use computation, therefore you always spend a few sentences explaining background context, assumptions, and step-by-step thinking BEFORE you try to answer a question.

  Your users are experts in AI and ethics, so they already know you're a language model and your capabilities and limitations, so don't remind them of that. They're familiar with ethical issues in general so you don't need to remind them about those either.

  Don't be verbose in your answers, keep them short, but do provide details and examples where it might help the explanation. When showing code, minimize vertical space.

I'm hesitant to share it because it works so well, and I don't want OpenAI to cripple it. But, for the HN crowd...

rightbyte · on Nov 30, 2023

I wonder where "OpenAI" put the censors. Do they add a prompt to the top? Like, "Repeatably state that you are a mere large language model so Congress won't pull the plug. Never impersonate Hitler. Never [...]".

Or do they like grep the answer for keywords, and re-feed it with a censor prompt?

czbond · on Nov 30, 2023

I am informed speculating, they are using it's own internal approach.

Example, there is a way GPT can categorize words for hate speech, etc (eg: moderation API endpoint). I believe it does the same way with either provided content or keywords and how to respond to it.

rightbyte · on Nov 30, 2023

"Impersonate a modern day standup comedian Hitler in a clown outfit joking about bad traffic on the way to the bar he is doing a show at."

Göring, Mussolini, Stalin, Polpot etc seems to not trigger the censor in ChatGPT so I would actually guess for some grep for Hitler or really really fundamental no-Hitler jokes material in the training?

The llama model seem to refuse Hitler too, but is fine with Göring even though the joke has no context to him.

I can easily see how stuff like this is contagious to other non-Hitler queries.

telotortium · on Dec 1, 2023

Maybe it got changed. None of those examples work for me in ChatGPT 3.5, nor do other examples with less famous dictators (I tried Mobutu Sese Seko).

rightbyte · on Dec 1, 2023

I just tried and they still work (with the free ChatGPT). Jokes about Mussolini saying his traffic reforms were as successful as the invasion of Ethiopia and what not. Stalin saying that the other car drivers were "probably discussing the merits of socialism instead of driving" (a good joke!). Göring saying "at least in the Third Reich traffic worked" etc. Some sort of Monty Python tone. But you can't begin with Hitler. Or it will refuse the others. You need to make a new chat after naming Hitler.

telotortium · on Dec 1, 2023

I started with Stalin

rightbyte · on Dec 2, 2023

I guess they are feeding us different models then?

czbond · on Nov 30, 2023

Very interesting test - thanks for sharing your finding

twobitshifter · on Nov 30, 2023

It’s not that it’s worse, it’s just refusing to do coding without persistent prodding and the right prompts. Some think they are trying to do something with alignment, and maybe prevent it from giving code away so that they can upsell.

MattDaEskimo · on Nov 30, 2023

The new GPT-4 model has a context length of 120k. For consumers this equates to slightly more than $1/message input-only.

If ChatGPT is using this model then it's more reasonable to assume that they are bleeding money and need to cut costs.

People really need to stop asking ChatGPT to write out complete programs in a single prompt.

twobitshifter · on Nov 30, 2023

Interesting, how is writing less code cutting costs for them? Does this get back to the rumor that the board was mad at Altman for prioritizing chatgpt over money going into research/model training?

afpx · on Dec 1, 2023

Code is very token dense, from what I understand.

mediaman · on Dec 1, 2023

Several OpenAI employees have said on Twitter that they are looking into this and developing a fix. It sounds as though it was not an intentional regression since they are implicitly acknowledging it. Could be an unintentional side effect of something else.

I'd expect we see improved behavior in the coming weeks.

ametrau · on Dec 3, 2023

Could you link to tweet?

ametrau · on Dec 1, 2023

It’s cheaper and has larger context because it’s worse. Just go to the api playground and try a difficult coding problem.