The tricky part of that is loading and mapping the company's representation of the data to the SaaS representation, so the product can run whatever analysis it wants to deliver. That step can be almost as much work as just writing the application from scratch, since you may now have two copies of the data to keep in sync: the actual copy the company uses and the copy used for the SaaS analysis (and often a separate copy for each SaaS the company is using as a substitute for writing the software themselves).
I worked briefly with a startup whose idea was to create an index of your entire corporate data (PDFs, PowerPoints, docs, etc.), and their product would intelligently serve up documents as you typed.
So, for example, if you were writing an email or Slack message about a slide deck from a recent meeting, the widget would already have found the deck, ready for you to drag and drop into the message.
It worked freakishly well, and saved a whole bunch of time trying to find a file in the mess that everyone's Google Drive becomes.
The problem they were having was that no enterprise wanted to hand over an entire copy of its data for them to index.
AWS just introduced Kendra to do the same thing. They will likely run into the same concerns. Google had a search appliance that I thought was a brilliant idea but that never got much traction: https://en.wikipedia.org/wiki/Google_Search_Appliance
Google Search Appliance was used by thousands and thousands of big corporations, so to say it didn't get much traction is not correct. For many years it was the de facto solution for enterprise search.
Huh, I basically never heard of it being used. Guess that's the trick with 'enterprise solutions': unless you're explicitly looking for one, or it pops up on an aggregator like HN or /. (this was starting in the mid '00s), you don't really hear about them.
That's another difficulty. Even if the actual transfer is relatively painless, it is a risk every time your company gives a copy of customer data to another company.
Using AWS, for example, all it takes is a misconfigured bucket to expose massive amounts of customer data.
We basically did that for healthcare (electronic medical records) in the mid 2000s. Our customers (actual users) loved us.
Like your enterprises, no hospital (or equivalent) would consider exfiltrating their data.
So our gear was physically deployed and operated on each hospital's own system. Our marketing and sales goons called this our "federated data model", which had no tangible connection to reality.
I likened the experience to building model ships in a bottle, while wearing mittens, while someone is punching you in the face. To mitigate, we created what would now be called CI/CD pipelines. We managed hundreds of deployments with near-zero downtime and effortless rollbacks. I'm rueful just thinking about it.
This strategy didn't survive the 2008 economic crash.
There are multiple mature products in the "enterprise document management" space that work similarly to the product you describe, available on premises and in the cloud[1]. There are also (smaller, specialised) consulting companies that show how to implement these products in a business. There are some really nice features, like robust OCR, meaning that after scanning you can immediately search on document metadata such as invoice numbers. That helps companies that insist on having paper documents as part of their processes, or crusty old management that refuses to use digital signatures.
The challenge is the same as the main topic of this thread: change management. You have to force employees to manage their documents exclusively within this system rather than dumping everything into shared drives and using Outlook attachments for everything.