Hacker News
Show HN: Bible translated using LLMs from source Greek and Hebrew (biblexica.com)
47 points by epsteingpt 1 day ago | 62 comments
Built an auditable AI (Bible) translation pipeline: Hebrew/Greek source packets -> verse JSON with notes rolling up to chapters, books, and testaments. Final texts compiled with metrics (TTR, n-grams).
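
For readers unfamiliar with the metrics named above, here is a minimal sketch of a type-token ratio (TTR) and n-gram count over one rendered verse. The function names and the regex tokenizer are assumptions made for illustration; the post does not say how the pipeline actually computes its metrics.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters (and apostrophes) as tokens."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(tokens):
    """Distinct tokens divided by total tokens; 0.0 for empty input."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def ngram_counts(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


# Hypothetical usage on the "optimal" rendering of Genesis 1:1 quoted in this thread.
verse = "In the beginning God created the heavens and the earth."
tokens = tokenize(verse)
print(round(type_token_ratio(tokens), 2))
print(ngram_counts(tokens, 2).most_common(3))
```

A lower TTR means more repeated vocabulary, and heavily repeated n-grams can point to formulaic phrasing, so both are cheap signals to compare across renderings rather than measures of translation quality.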

This is the first full-text example as far as I know (Gen Z bible doesn't count).

There are hallucinations and issues, but the overall quality surprised me.

LLMs show a lot of promise for translating ancient texts and making more of them 'accessible'.

The technology has a lot of benefit for the faithful, which I think is only beginning to be explored.





One Bible translation that I really appreciate is the NET Bible [0] -- in particular, I appreciate its translators' notes. It can be very helpful to read the translators' notes and to understand the reasoning that went into any particular rendition. For example, something like "The disjunctive clause (conjunction + subject + verb) at the beginning of v. 2 gives background information for the following narrative, explaining the state of things when “God said…” (v. 3)..."

Did you use a reasoning model to translate these verses? If so, I would be very interested in seeing the breakdown that the LLM used that went into each verse.

I understand that such breakdowns can be hallucinated at many levels also (and final output does not always correspond with the reasoning flow), but I (personally) would find this helpful.

[0] https://bible.org/sites/bible.org/resources/netbible/


Yes agree on translation. The story of arriving at a word is usually more interesting than the word itself.

In an ideal world we could ingest the full study Bible's notes. My guess is much of the NET-level (or other study bible) scholarship is part of the base model corpus.

Here's a deeper view into the verse process. We did use a reasoning model.

    {
      "reference": "Genesis.1.1",
      "optimal": "In the beginning God created the heavens and the earth.",
      "poetic_daily": "At the dawn of all things, God shaped sky and soil into being.",
      "footnotes": [
        { "anchor": "God", "note": "Hebrew אֱלֹהִים (Elohim); plural in form but singular in meaning when referring to Israel’s God." }
      ],
      "controversies": [],
      "connectives_check": [
        { "source": "וְ", "rendered": "and", "status": "kept" }
      ],
      "consistency_flags": [],
      "scholars": [
        { "name": "Thomas Schreiner", "one_sentence_view": "Emphasizes the verse as the absolute beginning of creation, affirming God's sovereign initiative." },
        { "name": "Walter Brueggemann", "one_sentence_view": "Sees the verse as a theological overture introducing God’s ordering power over chaos." },
        { "name": "Eugene H. Peterson", "one_sentence_view": "Views the line as the opening note of a grand narrative, inviting readers into God’s creative story." }
      ],
      "lexeme_refs": [
        { "anchor": "God", "lang": "he", "lemma_id": "430", "note": "אֱלֹהִים (Elohim) chosen per rule preference; conveys the singular Creator without plural nuance." },
        { "anchor": "and", "lang": "he", "lemma_id": "c/853", "note": "Connective וְ/ marks coordination between 'the heavens' and 'the earth'; retained explicitly." }
      ],
      "review": {}
    }
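
Because each verse is a structured record, some of the "auditable" part can be checked mechanically. The sketch below is a hypothetical consistency check, not the project's own tooling: it uses a trimmed copy of the record above (the footnote text is shortened) and an invented `audit_verse` helper that verifies every footnote anchor appears in the `optimal` text and every connective marked `kept` is actually present as a word.

```python
import json
import re

# Trimmed copy of the Genesis 1:1 record from the comment above (note text shortened).
RECORD = """
{
  "reference": "Genesis.1.1",
  "optimal": "In the beginning God created the heavens and the earth.",
  "footnotes": [{"anchor": "God", "note": "Hebrew Elohim"}],
  "connectives_check": [{"source": "וְ", "rendered": "and", "status": "kept"}]
}
"""


def audit_verse(record):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    text = record.get("optimal", "")
    words = set(re.findall(r"[a-z']+", text.lower()))
    for note in record.get("footnotes", []):
        # A footnote should point at a phrase that really occurs in the rendering.
        if note["anchor"] not in text:
            problems.append(f"{record['reference']}: footnote anchor {note['anchor']!r} not in text")
    for conn in record.get("connectives_check", []):
        # A connective reported as kept should show up as a word in the rendering.
        if conn["status"] == "kept" and conn["rendered"].lower() not in words:
            problems.append(f"{record['reference']}: connective {conn['rendered']!r} marked kept but missing")
    return problems


record = json.loads(RECORD)
print(audit_verse(record))
```

Checks like these cannot catch a hallucinated footnote, but they can flag records where the notes and the rendered text have drifted apart.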


> The technology has a lot of benefit for the faithful

Although written primarily for Orthodox Christians, there are valuable cautions here to consider regardless of your tradition: https://www.jordanville.org/artificialintelligence


100% agree, like any technology it's neither inherently good nor evil.

In this instance, I think it has the opportunity to democratize deep religious study in ways that used to be reserved for serious scholars.

e.g. Do you know what the word "daily" in the Lord's Prayer comes from?

Questions like these can engage the mind and spirit.

I hope more people use the tools to fully explore their faith, instead of outsourcing prayer and sermon creation to the LLMs.


You say you "100% agree" with the essay, and then say that LLMs are "like any technology it's neither inherently good nor evil."

Did you read the essay? It says:

"Instead of being merely “agnostic” as many argue, digital technology has amplified the ability of the princes of this world to feed the fallen man, to make him more docile and distracted while installing beliefs, morals, and feelings that are acceptable to the secular spirit of this age. AI may be the final technology that is weaponized to create this new man before the Antichrist arrives, who will be the human manifestation of AI---an ever-helpful problem-solver who people mistakenly feel they cannot live without."

Your position is diametrically opposed to this one.


I did read the essay. Thanks for the discussion!

"Those who use AI must always remember that it is psychologically designed to keep you typing and asking. It targets your vulnerabilities to achieve this end without any spiritual concern for your soul. To the creators of AI, your addiction to their platforms is a metric of their success.

Many have told us that AI chatbots give “good” spiritual answers that are “correct,” but as long as the underlying programming of the AI is to keep you directed onto itself, the behavior of AI is simply that of a false elder. A false elder may very well teach correctly and coat his words with a spiritual veneer, but ultimately, he wants you to focus more on himself than on Christ. Dealing with a false elder can cause a believer severe spiritual damage by distorting what should be a relationship with the divine to one of dependency with a person who seeks his own glory. Today, AI may share dogmatically correct spiritual answers, but its goal is not your salvation but for you to ceaselessly ask it more questions. The creators of AI want you to love their own creation, not the Lord Himself."

This is true of any technology and, in many cases, of many human spiritual leaders.

My agreement is not meant to downplay the risk or natural 'amorality' of such a technology; it's clear from Grok, for example, that AI can truly do evil. But LLMs are not internet porn or gambling.

Just because the current version and incentives of technology are arranged in such a way doesn't mean you can't counteract it.

The enemy will find and use new ways to enslave us - we can't reject progress because of that.

So yes, I do 100% agree with the current mission and technology. But unlike the author, I personally believe the ultimate work of Christianity in humanity is to turn us all to repentance and bring us closer to God, not to reject the sinners.


> I personally believe the ultimate work of Christianity in humanity is to turn us all to repentance and bring us closer to God, not to reject the sinners.

I'm Native American (indigenous, or whatever other moniker you've heard). Both of my paternal grandparents were subjected to the horror of boarding schools. So forgive me if I'm a bit cynical when it comes to the methods deemed appropriate by Christianity to "turn us all to repentance and bring us closer to God."

I would argue that instead of being a tool to try and convince more people that the Abrahamic god is the "right one", maybe think about using LLMs to challenge your own biases regarding religion and to question the myriad of moral and logical issues presented within your holy book.

Just a suggestion from someone also looking at the idea of utilizing LLMs to preserve and explore indigenous language, culture, and wisdom without becoming a slave to the technology.


We're all sinners in Christianity. The cynicism is earned, and it is why we're saved not by works but by Grace.

LLMs here are a tool for accessing old works, which happen to coincide with my faith.

If you're using LLMs to preserve and explore indigenous texts and languages, that is an absolutely wonderful thing to do. I wish you great success.

There are an increasing number of orphan / dying / dead languages, and there could be a project to 'resurrect them' and comprehensively translate their texts to spread them more widely.

I wish you great success on your journey!


> but the overall quality surprised me.

With all due respect, how are you in any position to be able to objectively evaluate the quality assuming you’re not fluent in Hebrew and Greek?


You read multiple English translations and you get your own sense of how biblical text can be and is rendered.

Admittedly it's an aesthetic judgment on certain verses.


I'm not sure how this is accomplished, but I like the "poetic" translation a lot more than the "optimal" one.

Which reminds me, do you think it's possible that the stories in the Bible are actually mystic symbolism and "veiled truth" (like the sort of stories that you might get in a dream) and people have mistaken it for actual physical history (with which it's obviously incompatible)?

The parables of Jesus come to mind. They weren't meant to be taken literally but to teach, to get a point across.


There are a great many views about this depending on who you talk to. In Christian circles, it’s essentially the infallibility vs inerrancy topic, with fundamentalist denominations leaning toward inerrancy (which is the view that original manuscripts have complete historical accuracy).

Obviously, you have to take a strong “religion first” lens to everything about the world from there.

But of course, there were ancient cultures that pre-date Judaism (and by extension Judeo-Christian sources), which share many similar stories but with different details and descriptions. Large scale flood myths and arks are common in history. You can read the Mesopotamian version in the Epic of Gilgamesh, which is strikingly similar to Noah’s ark.


Yes, the main churches can only stick to the traditional interpretation. What else could they do? Anything else would be pretty much, well, blasphemy.

But I think my favourite interpretation that I've heard so far is that the stories in the Bible are like the protective husk that preserves the kernel of truth. The stories are catchy and have stuck, unwittingly allowing the truth to be carried across the centuries, safely hidden in the minds of men who did not understand it, until the day comes when people grow up enough, to the point where they could crack the shell and eat the fruit.

I really like how that sounds, but of course, there are probably not many others who see it in that light. Luckily for me, these days they don't burn heretics any more (at least where I live :)).


If you read the Bible, there is no way to come to that conclusion. The Bible takes itself incredibly seriously; so to say that

> The stories are catchy and have stuck, unwittingly allowing the truth to be carried across the centuries, safely hidden in the minds of men who did not understand it, until the day comes when people grow up enough, to the point where they could crack the shell and eat the fruit.

is to betray just a general lack of understanding of the text. Just because you're exposed to the stories doesn't mean you understand the stories, the truth of the stories, or their real intended meaning. It takes really smart people a lot of time and a lot of effort to just begin understanding the breadth and depth of the Bible. It's deeply humbling to begin to unravel it and see the story for how it portrays itself. I would really encourage you to take one story from the Bible, for example the garden of Eden, and see how it traces itself throughout the entire scope of the Bible and the different forms and iconography that show up just from that one story.


You present the Bible as one text composed at one time, but I’ve never known anyone to take that view. The Bible can’t “take itself incredibly seriously” because it spans millennia in time, including at least a hundred years after Jesus. Hundreds of years after that is when “The Bible” as we know it today was even assembled from pieces during unification. Before that, early Christians had hundreds of religious texts and, through a process of negotiation, brought them together under the Roman state. I’m sure if you read something like the infancy gospels, which are not included in the Bible, you could probably also find similar themes.

Of course the stories remained culturally relevant through oral traditions and Jewish law. The common thread is culture and the stories of a people.


What is the difference between the "Adam" translations and the "Eve" translations? Where can I read about this more?

Different models only. Same process.

There are, and will be, more details on the process in future blog posts (the blog is linked at the very bottom).


I heard or read that the LLM translation system is trained upon Bible translations because the Bible has been translated into more languages than any other book.

I would be really interested in this done to the Peshitta Bible, which is roughly as old as the Septuagint. The Peshitta is in Aramaic, a sister language to Hebrew. Over the years I've found interesting insights about verses that make way less sense in Greek but drastically more sense in Aramaic. It seems the Greek was translated from some other source in which the Aramaic or Hebrew word could have been one of two words, and in some cases the Greek seems to have picked the worst possible representation, which the Aramaic highlights.

For example: it is easier for a camel to go into the eye of a needle than for a rich man to get into Heaven. If you read this, it makes it sound like Abraham cannot get into Heaven, wasn't he wealthy? Heck, there's others who were wealthy in scripture, even kings are they all doomed? In Aramaic, the same word that in Greek is said to mean camel can also mean rope.

Think about a rope going through the eye of a needle, and what it TAKES for a rope to go through: removing all the threads or layers (humbling the person and forcing them to strip themselves down to their core) in order to make it through the eye of the needle. Or in other words, you must be willing to detach yourself from all your wealth. Remember the guy who asked Jesus what he must do to be saved and enter heaven, and walked away when Jesus told him to give away everything he owned to the poor? That is the same exact message.

There are a few other verses, but that's the main one that always strikes me. Some of them are far more nuanced and I get into hours of debate with people who are ignoring everything I am saying (I don't know why, I try to lay it all out in the most simple way possible) as if I'm breaking the law, but it's obvious to me that we don't have perfect copies of the Bible. I still think the overall message is the same though, so nothing wrong with that. It proves yet again that men are all fallible.

Sorry for the tangent. I used to deep dive translations and their nuances, and the Aramaic based Bibles are very interesting.

There's also an Aleph Tav Old Testament Bible which is fascinating to me. It adds the Aleph Tav into the English anywhere it would appear in the Hebrew.


But the Peshitta is 300 years after the Septuagint and the verse you mention was written in Greek in a gospel, not the Hebrew Bible. I don't know how the translators of the Peshitta would have any special access to sayings of Jesus that predate the Gospels we have now. I don't know if there is any hard evidence that the Gospel authors had actual eye witness written sources in the language that Jesus spoke. So you have to assume that Jesus used Gamla in Aramaic and that the Gospel writer mistranslated it when writing in Greek but that the Peshitta gives special insight by retranslating it back to the ambiguous word. Then you have to make another leap to think that it is somehow possible to manipulate a rope to be the size of a thread. Sounds like a lot of histrionics to justify Abraham going to Paradise when the simpler explanation is that it's just a concept of difficulty rather than a logical word problem. This seems less plausible than the Eye of Needle gate theory which many Christian teachers often reference.

The Peshitta was over a few centuries. Have you never seen the threads of a rope becoming undone? To the rich man covered in wealth, it would be chaos if they love their wealth more than they love God. My point is that it makes way more sense and yes it is symbolic, but a camel never made any sense to me and to a lot of people for that matter.

Correct, the Peshitta translations of the Gospels were done in the 5th century CE. They were not closer to the Hebrew of the original as you suggested. If you like it better that's fine, but it's not inherently better because it was earlier or had better understanding of original languages.

> Heck, there's others who were wealthy in scripture, even kings are they all doomed?

This is a great question. In the next verses, the disciples ask pretty much the same thing: "Who then can be saved?" and then Jesus explains to them:

    With men this is impossible, but with God all things are possible.

Whether it's a camel or a rope (and whether it's a literal needle or a small city gate, as some people argue), I think is less important (though still interesting). Either way, after the rich young ruler walks away, Jesus turns to his disciples and paints a picture of something that's completely impossible without God, no matter how hard we might try by ourselves.

[flagged]


Genesis is a theological narrative, which is very different to most things we read these days, especially as a software engineer.

1. The general consensus is that there were more people. This is assumed in Genesis and it (annoyingly!) doesn't bother to explain it, as the audience at the time already assumed it. Also, the authors weren't interested in all the logistics and technicalities that we are today.

2. Cities referenced in Genesis were likely fortified settlements, rather than like modern cities.

The idea that people in Africa could only build simple huts is a myth that came from the colonial era. Africa had large cities, architecture and metallurgy while parts of Europe were still tribal.

If you're keen to learn more, there are some good books that explain this much better than a comment can, such as "How to Read the Bible for All Its Worth" by Fee & Stuart and "Genesis for Normal People" by Pete Enns. I haven't read it but "African Civilizations" by Graham Connah is probably the go-to book on how African cities and technologies were so much further ahead than traditional European/US narratives place them.

The best resource for these kinds of questions is probably "The Bible Project". They have a load of YouTube videos and podcasts that cover these kinds of questions.


I don't fully know. However, I will note that, to my understanding, Moses wrote the first five books (Genesis, Exodus, Leviticus, Numbers, and Deuteronomy), and Moses came long after these events. I don't know what his source material was, but if it was word of mouth or other scrolls, it's possible people in his time had access to those other scrolls, which are now lost to time. If he felt that anyone could read the other scrolls for more information, I could understand why there's not more information about these people.

I have no idea why some people do what they do. I will say I am very jealous of the Amish because they don't have the stresses I have, or half of the issues I have. No money for gas? I don't think they need to worry or care about it.

The other thing is, what does it really mean that he made a city? It could mean that he started an encampment elsewhere. We don't know how many other people God would have made during Adam's / Cain's time; I would imagine God would have made Cain a wife at some point.


Once we get to Cain and Abel, it is far easier to understand if we think of these names as tribes of humans, and if we accept that there were other humans outside of the area of Adam and Eve.

These are my thoughts as well. I think God made other people; they were just not entirely necessary to capture in Genesis itself. There are probably other scrolls about them elsewhere.

The Bible is too well-known a text that is too represented in training datasets for this _not_ to be skewed towards poorly reproducing existing translations.

Beyond that,

>there are hallucinations and issues

seems like a deal-killer for a religious text. Yes, all translation by humans is an act of interpretation on some level, and so there's lossiness in all translation – but the difference between a human carefully weighing their reasoning for a particular choice of rendering vs. an LLM that is basically weighted dice that might land totally wrong is a categorically-different thing, not a question of degrees.


This is definitely one area where the training set for the LLM is liable to be polluted by existing translations and even straight memorized English biblical text.

Not to be too much of a devil's advocate (ha!), but I kind of think I _want_ it to have biblical translation data in the dataset. So long as it's not simply copying something like the KJV or ESV into the output, then this should be a good thing, right?

Because much of what it produces (especially in the "poetic" mode) does seem to be very much "off the beaten path" for a good number of renditions.

I don't think that the goal would be to have a dataset that is completely free from scholarship on the topic of Biblical translation, but rather to synthesize the rules and principles from the collected body of knowledge and apply it (with steering) to the entire Biblical text.
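
One rough way to test the "not simply copying the KJV" condition is to measure how many of a rendering's word n-grams also occur in an existing translation. This is only a sketch under stated assumptions: `shared_ngram_fraction` is a hypothetical helper, trigram overlap is a crude proxy for memorization, and the comparison uses the KJV wording of Genesis 1:1 against the two renderings posted in this thread.

```python
import re


def tokenize(text):
    """Lowercase the text and keep runs of letters (and apostrophes) as tokens."""
    return re.findall(r"[a-z']+", text.lower())


def ngram_set(tokens, n):
    """Set of all runs of n consecutive tokens."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def shared_ngram_fraction(candidate, reference, n=3):
    """Fraction of the candidate's distinct n-grams that also appear in the reference."""
    cand = ngram_set(tokenize(candidate), n)
    if not cand:
        return 0.0
    ref = ngram_set(tokenize(reference), n)
    return len(cand & ref) / len(cand)


kjv = "In the beginning God created the heaven and the earth."
optimal = "In the beginning God created the heavens and the earth."
poetic = "At the dawn of all things, God shaped sky and soil into being."

print(shared_ngram_fraction(optimal, kjv))  # 5 of 8 trigrams shared: 0.625
print(shared_ngram_fraction(poetic, kjv))   # no trigrams shared: 0.0
```

High overlap on a famous verse is expected for any faithful translation, so a check like this is more informative when aggregated over whole chapters and compared against the overlap between two human translations.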


Thoughtful critique.

No one is suggesting you replace your ESV or NKJV with this for your religious study. This is as much a technical project of interest as it is a faith-based one.

In terms of your view of the priors on the Bible, you've described in my experience the process all translations go through. We're all skewed by default toward reproducing (poorly) previous translation through word choice modification.

That is, in many ways, the whole thing. My guess is an iterative approach can actually yield a better approach as words shift meaning socially over time.

But we will see!


[flagged]


Yes, to complain that a religious text has problems because it was hallucinated is pretty ironic.

What's up with these?

Genesis 1:13, Eve optimal Replace 'Then' with 'And' in optimal ('And the LORD God said') and poetic_daily to preserve narrative vav-consecutive connective consistently.


Thanks for noticing. The blog has more detail; these are first iterations and the verse and editorial notes sometimes slip in :-(

Side project, big book.

Those will get ironed out soon.


What is the "expanse"?

Answer: The sky. The ancient people who wrote the Bible thought the sky was a solid dome that separated "the waters above" (aka rain) from the waters below. God lived on the other side of this dome.

This is confirmed later in Genesis with the Tower of Babel story.

They tried to reach this dome by building a tower. And "god" was so offended by their ignorance and stupidity (which he perpetrated) that he decided to punish them.

The "faithful" obviously reject this simple interpretation in favor of something more obtuse and mystical.


Not here to have a religious debate - though given HN, it may turn into one!

Imprecise language is a common human symptom of a lack of understanding - something we all suffer from. We call LLMs "AI" without fully understanding what's artificial and what's intelligence.

The story of faith is, in some ways, the story about how little we know about the universe. That doesn't mean there's no progress. If anything, it shows there is an end goal.

The ancient narratives of Babel and Genesis reduce the incomprehensible (Creation, the Divine) into elements we as humans at that time could understand.

How else could our ancestors have possibly related to the divine?


there is also the unspoken alignment of those people from being of the same age / time period.

humans additionally have a spectacular ability to use absurdity and loose definitions of things in ways that play with this unspoken alignment to communicate other ineffable ideas and/or build community. I'd go as far as to say we play with this unspoken alignment more so than we say exactly what we mean.

I would think this behavior, although often seen in meme culture nowadays, would be highly relevant to religious communication and documentation of the past. I think actually trying to write down an exact meaning is a modern phenomenon, seen in the over-articulation and general structure of "legalese", which I don't think the Bible resembles in spirit in any way.


>How else could our ancestors have possibly related to the divine?

Simply, the "divine" isn't real. Nothing within the Bible points to any truly incomprehensible truth. The God of the Bible is not beyond understanding, he is cut from the same cloth as other sky-father deities of the time.

Everything within the Bible was limited to what the authors themselves could comprehend given their personal and cultural biases, because they are a product of human imagination and intellect.

When the writers said the world was created in seven days, they meant seven days. When they said that people tried to build a tower to heaven and God struck it down, and created different languages to confuse humans so they never tried that again, that's what they meant, and believed.

You could bring up the Trinity as an example of God's incomprehensible truth but the Trinity doesn't really exist in the Bible, it's an extra-biblical concept created by Christians as a philosophical compromise between rival factional ideologies and a desire to maintain a monotheistic religion given polytheistic elements. It is intentionally irrational but there is no deeper truth behind it. It's just accepted as a matter of faith.


> How else could our ancestors have possibly related to the divine?

There is nothing "divine" in the story to relate to.

It is a collection of unscientific, erroneous myths and beliefs that were popular in the culture at the time it was written --- by men. The only reason any divinity can still be ascribed to it is that these basic facts have been somewhat obfuscated through translation.

I truly appreciate the fact that they put this right up front in the book. Interpreted for what it is, it succinctly obviates the need for much further consideration or worry.


I think you would actually find biblical scholarship really interesting. It's way more fascinating than you give it credit for.

Biblical Scholarship is a lot like kindergarten art scholarship... I can look at my kid's art and identify the changing themes, influences of substitutes, changing friend groups, and step functions introduced by art class... And all of those are real and intensely interesting to me... but a random stranger will take a glance and notice that it's clearly in crayon by someone obsessed with the idea of cat and unicorn hybrids...

What I'm saying is that just because I could spend untold hours analyzing kindergarten art projects and present it to the parents in the class who will also find it intensely interesting, cat-icorns aren't real... they're just my child's way of imagining what's beyond their perceptions.


Which set of NT Greek manuscripts is it using? Textus Receptus? Byzantine? Critical Text?


This representation has no diacritics. Is this not a problem?

Diacritics weren't in the original texts

There is 100% chance there are already translations in the dataset (including texts with Hebrew - English side by side etc).

I mean, I should hope so? I don't think the point is to create a completely "from scratch" translation -- this feels like a more practical exercise than academic.

What's fascinating to me is the suitability of some of the renditions that have been made. I've done a little bit of spot-checking, but some of the renderings (especially in the poetic version) don't seem to be very common translations.

Ideally this would be building off of established practice for translation, synthesizing existing human work, and applying those aggregated principles over the entire corpus.

It doesn't seem to simply be going with the majority version.


LLMs' initial purpose was translation from one language to another, so it's not surprising they are good at it.

Is this translation public domain?

For the moment, no. But it will be in the next few months.

There are lingering errors (hallucinations, notes) that need to be removed before it would be reasonable to share and study it broadly.

The other commenter has pointed out novel copyright issues, not that it's likely we'd go to court over this.


A complicated question. They assert copyright, but whether or not that would hold up in a court of law in any given jurisdiction is still largely an open question.

Stuck with a "Loading..." message. Too much traffic, maybe?

The main page? It's a static site on Cloudflare. The LLM calls go through OpenRouter and are rate limited, so that might be what's going on.

If you have more details, we'll investigate.


The main page loads fine, but the main content pane just shows “BIBLEX Loading…”

Strange - loads fine for me. What browser / OS are you on?

It was slow to load for me but loaded eventually.

Seems heretical.

Say more - why?

These books are social in nature; it takes agreement to make revisions. The agreement process is part of a spiritual path, in multiple ways. One translation by a machine is new, but newness is exactly not the point of these works. Stability through generations and, to some extent, resolving open theological questions are much more the point than newness. Also, pride and vanity are expressly discouraged. If this is a personal achievement somehow, it's not very consistent with the core teachings. You ought to acknowledge your teachers, your spiritual community, their leaders and elders, and other inputs, because without all of that you would not be able to complete a non-trivial work of scholarship. Other ideas missing?

Agree.

And if we could bring more faithful people into that agreement process, that's a good thing.

As for the personal achievement, nothing really is a fully personal achievement in this (or really any) domain.


Can we do the same - but get The Book of J?

How was the translation for the chapters in Aramaic, like in Daniel?


