Matthew Linkous 2023-06-13 16:38:15

I’ve been thinking about this concept of a “full-stack” database. I’m curious what connotations that evokes for folks here and what features you imagine a futuristic, “full-stack” database would have. Bonus question: is it “fullstack”, “full-stack” or “full stack”?

Walker Griggs 2023-06-13 16:48:06

Full stack, in my mind, has a few connotations: level of abstraction and accessibility.

Two things that come to mind:

  • Designed to be used by all front-end, back-end, and edge applications. A cross between firebase and postgres, if you will. In many ways DynamoDB feels like an interesting abstraction: NoSQL built on a MySQL backend. So my S1 when I hear "full stack" is a realtime nosql abstraction built on a strictly relational db (with an escape hatch to access the latter).
  • A DB with bundled, native ORM that speaks wire protocol. Most ORMs I see are built as an afterthought. "Full-stack" evokes notions of accessibility from a language API perspective. This point is more of a personal feel and not a practical concern though.

Walker Griggs 2023-06-13 16:57:58

Bonus: I often see a micro-service deployed to translate between the business logic and the DB layers. That service often ends up being "endpoints on a stick" with a side of "validation and authentication"

A "full-stack" db in my mind would alleviate the need for such a service. Whether that means exposing HTTP endpoints and natively translating to the query language, or omitting the SQL query language layer altogether and exclusively bundling the ORM + query language + wire protocol, I'm not sure.
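As a rough sketch of the first option (HTTP-style requests translated natively into the query language), here is what the translation step might look like. Everything here is an assumption for illustration: SQLite stands in for the database, and `SCHEMA`, `translate`, and `handle_get` are hypothetical names, not any real product's API.

```python
import sqlite3

# Hypothetical allow-list standing in for the schema the database already knows.
SCHEMA = {"users": {"id", "name", "email"}}

def translate(path, params):
    """Turn e.g. ("/users", {"name": "Ada"}) into a parameterized SELECT."""
    table = path.strip("/")
    if table not in SCHEMA:
        raise ValueError(f"unknown resource: {table}")
    unknown = set(params) - SCHEMA[table]
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    columns = sorted(params)
    sql = f"SELECT * FROM {table}"
    if columns:
        # Values are bound as parameters; only validated names reach the SQL text.
        sql += " WHERE " + " AND ".join(f"{c} = ?" for c in columns)
    return sql, [params[c] for c in columns]

def handle_get(conn, path, params):
    """Validate, translate, and run a GET-style request against the database."""
    sql, args = translate(path, params)
    return conn.execute(sql, args).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.execute("INSERT INTO users (name, email) VALUES ('Ada', 'ada@example.com')")
rows = handle_get(conn, "/users", {"name": "Ada"})
```

The validation that usually lives in the "endpoints on a stick" service collapses into a lookup against the schema, which is the part a database could plausibly do itself.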

Your question is bringing a few thoughts on accessibility to the surface and I'm brain dumping

Matthew Linkous 2023-06-13 17:09:19

Very interesting! I like where your head's at.

Eli Mellen 2023-06-13 17:21:20

Have you looked into frontier?

Eli Mellen 2023-06-13 17:22:00

I also wonder about array programming languages, especially k/q that come bundled with a database, so the distance between language and data and database is minimal

Matthew Linkous 2023-06-13 17:22:28

Mmm never heard of that but that sounds similar to how Smalltalk is part database

Eli Mellen 2023-06-13 18:29:57

yeah! a similar vibe for sure

Kevin Greer 2023-06-13 18:56:24

Makes me think of a few things: 1. en.wikipedia.org/wiki/Naked_objects 2. My own FOAM Framework, which was an actual database in earlier incarnations, but now abstracts away the underlying database youtube.com/watch?v=S4LbUv5FsGQ, 3. I don't have a reference but I once read about someone writing their whole web application in Oracle stored procedures. 4. IBM's OS/400 reddit.com/r/hackernews/comments/13603zt/ibm_as400_databases_all_the_way_down_video_2019 5. Gemstone, the Smalltalk Database: en.wikipedia.org/wiki/GemStone/S

🎥 FOAM DRY + WET

📝 IBM AS/400: Databases all the way down [video] (2019)

Matthew Linkous 2023-06-13 18:59:32

Ah yes I've seen the IBM demo! So cool. Haven't heard of FOAM before. Has that pattern / your framework unlocked any cool abilities for you? Kevin Greer

Kevin Greer 2023-06-13 19:31:09

We can write a lot less code to get things done much more quickly. Our model files are the equivalent of a database schema, but contain a lot more information. From them we can generate the actual database schema, but also class definitions in our various target languages (Java, Swift, JavaScript), GUI components like Table and Detail Views, and the network marshalling and API code so the various tiers can talk to each other, as well as traditional ORM-like code.
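To make the idea of one model driving several outputs concrete, here is a minimal sketch. It is not FOAM's actual model format or API; the `USER_MODEL` dictionary, its keys, and the three generator functions are all hypothetical, and Python stands in for the real target languages.

```python
from dataclasses import make_dataclass

# Hypothetical model: a schema plus extra information (here, display labels).
USER_MODEL = {
    "name": "User",
    "properties": [
        {"name": "id", "type": "int", "label": "ID", "primary_key": True},
        {"name": "email", "type": "str", "label": "Email address"},
    ],
}

SQL_TYPES = {"int": "INTEGER", "str": "TEXT"}
PY_TYPES = {"int": int, "str": str}

def to_ddl(model):
    """Generate a CREATE TABLE statement from the model."""
    cols = []
    for p in model["properties"]:
        col = f'{p["name"]} {SQL_TYPES[p["type"]]}'
        if p.get("primary_key"):
            col += " PRIMARY KEY"
        cols.append(col)
    return f'CREATE TABLE {model["name"].lower()} ({", ".join(cols)})'

def to_class(model):
    """Generate a class definition from the same model."""
    fields = [(p["name"], PY_TYPES[p["type"]]) for p in model["properties"]]
    return make_dataclass(model["name"], fields)

def to_table_header(model):
    """Generate column headings for a table view from the same model."""
    return [p["label"] for p in model["properties"]]

User = to_class(USER_MODEL)
ada = User(id=1, email="ada@example.com")
```

Adding a property to the model changes the table, the class, and the view heading together, which is where the reduction in glue code would come from.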

Matthew Linkous 2023-06-13 20:40:37

Very cool. That sounds similar (at least on the database side) to how I understand Facebook’s ENT framework works

Kevin Greer 2023-06-13 21:28:00

Thanks for the ENT link. Yes, it looks similar, as do many other ORM and middleware systems, but the difference is that FOAM is much wider, in that it spans all the way from the back-end of the back-end to the front-end of the front-end, so spans several languages in the process. This also helps to avoid a lot of the glue code generally needed to create a complete system.

Andrew F 2023-06-14 01:41:24

To me it connotes trying to abstract over an unreliable network, and the inevitable frustrating failures resulting therefrom. :') But that's just me being bitter. Better would just be a database that's usable in programs running in any setting, and maybe even makes it easy to synchronize them. CouchDB/PouchDB come pretty close, if you squint.

Kevin Greer 2023-06-14 03:11:28

The problem with giving raw database access to clients is that you then have to duplicate business logic, which is a lot of effort and error prone, and the database's built-in authorization isn't usually sufficient, so you would also like that to be somewhere trusted and secure, not on the clients.

Leonard Pauli 2023-06-14 05:42:53

A complete-stack db is the logic, ui, action/history, and networking engine just as much as it is the data persistence one. Declarative/functional/constraint based, allowing for streamlined authorization. Platform agnostic, through exporters from itself to different environments, while retaining functionality and adding native interoperability. If a field, e.g. profile_picture, is added, literally through fewer keypresses than letters in that name (auto suggestions, inc. concept type with ACL, UI, performance assumptions, and more context for later suggestions/auto coding), the whole system reflects the change. E.g. on user posts, their profile, editable with upload + crop functionality... No need to update model/migration/validation(backend+frontend+orm)/authorization/endpoints/frontend api client/forms/multiple ui components/spec/documentation/... "It just works", still with the ability to change every detail down to the HDL architecture for the accelerator co-processor that will perform the upscaling if the uploaded picture is too low-res, used when exporting the app straight to custom silicon instead of, let's say, as a p2p wasm webgpu webxr spatial app with a mesh of persistence servers that also syncs with the iPhone/Android/macOS/Linux/chatbot (WhatsApp etc.) versions deployed in the same click...

Andrew F 2023-06-14 16:34:10

Kevin Greer not sure if you were responding to me, but I do strongly agree.. for the most part. The only exception is that with Couch, for my app, I'm pretty comfortable giving each end user their own database and letting them face the consequences of mucking it up. If they interacted with each other, it would get a lot more complicated real quick (my membership, billing, etc are in the standard relational setup).

Matthew Linkous 2023-06-14 16:35:58

Andrew F I’ve always loved the idea of giving each org/user their own database instance. Doesn’t work for all apps but when you can have that separation it’s awesome. Schema migrations can be trickier if you literally have different instances of databases

Walker Griggs 2023-06-14 16:49:18

A complete-stack db is the logic, ui, action/history, and networking engine just as much as it is the data persistence one.

I completely agree here. This is maybe what I was getting at with "a full-stack db in my mind would alleviate the need for such a service [db layer logic translation]". Databases tend to be at the center of our systems and, for the most part, services are organized according to their data access patterns. It seems absolutely reasonable to me that a "full stack" DB would handle and centralize core business logic without the need for each consuming service to duplicate logic.

Walker Griggs 2023-06-14 17:06:21

I’ve always loved the idea of giving each org/user their own database instance.

Matthew Linkous, not a lofty thought, but this reminds me of basic.xyz. I wonder what their team would have to say about this conversation 🤔

📝 Basic Database

The serverless database for user owned data

Kevin Greer 2023-06-14 17:46:20

We created issue-database and GMail clients which stored most of the data locally in the browser in IndexedDB and then just behaved as though that were the real database. Two-way updates were synchronized between the client and server when the client was online. This had two nice advantages: it allowed for simple offline support, and it offered excellent performance because of data locality. Here's an example of the kind of client-side performance we had: youtube.com/watch?v=y9i4oW9dHHw

Eli Mellen 2023-06-14 17:47:33

Kevin Greer was this a production system or a prototype?

Eli Mellen 2023-06-14 17:48:27

I am really interested in this sort of architecture, but when I’ve explored deploying it to production, there was a significant edge case that ended up eating up a lot of lines of code to handle scenarios where users had limited device memory/couldn’t hold everything locally for some reason

Eli Mellen 2023-06-14 17:48:56

the naive approach I tried was just to fall back to a total REST API, sort of normal architecture, but I wanted to find a way to do it incrementally.

Kevin Greer 2023-06-14 17:55:51

The issue trackers (there were two) were production and the GMail client was a prototype. We don't need to hold everything. If the user does a query, we return the answer from the local data; then, if online, we issue the same query to the server and update the client DB. That causes an MVC observer event and forces the GUI to refresh if there was more data. Then you could have a janitor to purge local data not viewed for some time.
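The flow described above (answer locally, re-run the query on the server when online, notify observers, purge unviewed data) could be sketched roughly like this. This is an illustration under assumptions, not the code of the actual issue trackers: `LocalCache` and its methods are hypothetical names, plain dictionaries stand in for IndexedDB and the server, and the server round trip is synchronous for brevity.

```python
import time

class LocalCache:
    """Client-side store that answers queries locally and refreshes from a server."""

    def __init__(self, server, online=True, clock=time.monotonic):
        self.server = server      # dict of id -> record, stands in for the remote DB
        self.online = online
        self.clock = clock
        self.records = {}         # id -> record held on the client
        self.last_viewed = {}     # id -> time the record was last returned
        self.listeners = []       # observers, e.g. GUI refresh callbacks

    def _select(self, predicate):
        now = self.clock()
        hits = {k: v for k, v in self.records.items() if predicate(v)}
        for k in hits:
            self.last_viewed[k] = now
        return hits

    def query(self, predicate):
        # 1. Answer immediately from whatever is held locally.
        result = self._select(predicate)
        # 2. If online, run the same query on the server and merge the answer.
        if self.online:
            fresh = {k: v for k, v in self.server.items() if predicate(v)}
            if any(self.records.get(k) != v for k, v in fresh.items()):
                self.records.update(fresh)
                updated = self._select(predicate)
                # 3. Notify observers so the GUI can refresh with the new data.
                for listener in self.listeners:
                    listener(updated)
        return result

    def purge(self, max_age):
        """Janitor: drop local records not viewed within max_age seconds."""
        cutoff = self.clock() - max_age
        stale = [k for k, t in self.last_viewed.items() if t < cutoff]
        for k in stale:
            del self.records[k]
            del self.last_viewed[k]
        return stale
```

Because the first answer always comes from local data, the client keeps working offline, and limited device memory is handled by tuning how aggressively `purge` runs rather than by switching architectures.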

Kevin Greer 2023-06-14 17:56:43

They're all open-source if you would like to see the code.

Eli Mellen 2023-06-14 17:56:58

I’d love that!

Kevin Greer 2023-06-14 17:57:22

I have a meeting in 3 mins, but will post links later.

Kevin Greer 2023-06-14 17:59:22

Here's a screenshot of the GMail client compared to the Android version:

📷 image.png

Kevin Greer 2023-06-14 19:15:32

Here's one of the issue trackers (the other one is Google internal):

Kevin Greer 2023-06-14 19:15:36

📷 image.png

Alex Cruise 2023-06-15 15:40:50

EdgeDB is also aiming in this direction:

Kevin Greer 2023-06-15 16:31:01

Why is it called "Edge" DB? From the name I would have thought that it would run in the browser or on mobile clients, but it looks like a server DB based on Postgres.

Walker Griggs 2023-06-15 16:50:26

The name feels like a misnomer. It's a compelling offering but the docs don't suggest anything about running "close to the client": CDN, on-device, or otherwise. Could be wrong though!

Matthew Linkous 2023-06-15 16:52:08

Yeah if I had to guess I bet they pivoted but kept the name. EdgeDB’s query language is super cool but I don’t think they’re doing anything that breaks the Postgres mold regarding deployment.

Kevin Greer 2023-06-15 18:27:13

Apparently it doesn't have to do with the edge of the network but the edges between objects (ie. relationships) with something they call "materialized edges". (From what I read on reddit, but I didn't see that on their website.)

Alex Cruise 2023-06-16 15:56:52

I still Want To Believe™ in something like OODBs but I don’t think anyone’s figured out how to scale them

Kevin Greer 2023-06-16 16:59:43

What's your definition of an OODB? An object-graph, or something like an Object-Relational database where the individual tables contain objects instead of tuples and the relationships between tables are themselves represented in the objects rather than externally through implicit SQL joins? Or something else? And what scale are you considering? I worked on this project pipelinepub.com/competitive-communications-market/breaking-benchmarks-with-microsoft-and-redknee.html/4 which scaled well using the Object-Relational model on top of other data stores. The Microsoft SQL Server part of that story is a bit fraudulent since the performance-critical components weren't stored there, just historic records.

Alex Cruise 2023-06-16 18:54:36

hmm.. I care most about having the domain model represented well, being able to use both uni- and bi-directional relationships as pointers, not having to worry about FKs… I was going to say I don’t really care about inheritance, but it always comes up when I’m struggling with ORM too.

One reason I’m not particularly sanguine about OODBs so far is I find I always need to be persnickety about the representation on the RDBMS side

Alex Cruise 2023-06-16 18:55:19

It’d be nice to have an ORM that was more like a constraint solver I guess, where I could describe properties of both the OO and RDBMS sides, and have it find a match

Simple Poll 2023-06-13 23:58:33

[Poll content not rendered in the archive]

Eli Mellen 2023-06-14 00:05:38

to add some nuance to my response — I’m not intending to get one (I mean, honestly, because I’m cheap as hell) because I’m unclear if it’ll mesh with personal accessibility needs

Kartik Agaram 2023-06-14 00:07:14

I'm tempted but it looks like it's only going to work with Macs, which I don't work with anymore.

Andrew F 2023-06-14 01:32:44

It's funny that this (at time of writing) landslide against buying one was exactly what my first instinct said would happen here in this future-oriented forum. From a foundational standpoint, fancy new outputs are not that interesting (at least to me, though I admit weird inputs still have their pull). And to a large extent this is a foundations-oriented bunch of people. Half the time we're still trying to finish learning the lessons of 30 years ago. Would everyone stop with the skyscraper projects up there? We're still trying to shore up the mines all this stuff is built on...

(FWIW I second guessed myself a lot before getting as far as parsing the actual poll output, so no real life points for a good prediction.)

Kartik Agaram 2023-06-14 01:38:34

Andrew F All my projects tend to assume that the CPU is nothing, and what to do about side effects (particularly I/O) is the whole ballgame.

Andrew F 2023-06-14 01:44:09

Kartik Agaram that's not wrong, effects are what make the CPU more than a heater. But would swapping out a VR headset for a screen upend your architecture? Hopefully not. Of making many I/O modes there is no end, so I'm looking for methods to handle the churn. Among other things.

Ivan Reese 2023-06-14 01:46:36

I don't consider myself all too interested in redoing the foundations upon which our software runs. Whereas, I'm extremely interested in reimagining the interfaces through which we write code (or, ideally, getting away from things that could even resemble "code"). So I think this all tracks.

Andrew F 2023-06-14 01:51:25

It's not necessarily replacing Windows, but getting rid of recognizable "code" is still an endeavor in theoretical foundations, as I parse it. It's just taking a run at our knowledge stack of what computation is, rather than today's concrete tech stack.

Ivan Reese 2023-06-14 01:51:58

With respect to price, I'm reminded of Alan Kay's argument that if you want to predict 30 years into the future, you can go 10 years by spending a lot of money, 10 years by (can't remember the rest of the quote, also I'm already broke and haven't even bought this expensive headset yet, but dammit if I'm not determined to build a new spatial programming interface if it costs me all my kidneys)

Lu Wilson 2023-06-14 04:50:21

I used to find these sorts of things exciting. More and more I find them depressing. Hard to separate the different parts of it. I think I'm excited about the technology, but not excited about the novel ways it'll be misused by people and orgs.

Konrad Hinsen 2023-06-14 05:09:06

My response is similar to Lu Wilson's. This is exciting technology, but it's also technology for technology's sake. It will create new opportunities, but also new perceived needs (-> consumption -> resource use -> environmental degradation ...) and new ways to do harm. I think humanity should explore such new ways, but slowly and constantly watching out for unintended and unsuspected side effects.

In summary: I hope the Vision Pro will remain very expensive for quite a while. The really motivated people can then play with it, but the rest of us can safely go on with our lives.

BTW, this is also my view on LLMs. Research and development, yes. Mass deployment, no. But since research and development is currently financed by mass deployment in an extractive economy, we can't have one without the other.

Jack Rusher 2023-06-14 07:22:48

We’ll likely get one to share at our lab (despite the custom fit). Leaving aside the obvious display stuff, I’m actually more interested in experimenting with the collection of input methods they’ve combined in this device. OTOH there’s a good chance it’ll induce too much nausea for me to use it for more than 20 minutes at a time.

Jason Morris 2023-06-14 04:37:10

Does anyone here know anything (or know someone who knows something) about data governance practices in enterprise environments and is willing to answer some newbie questions? I think there may be a strong use case for my tool in that space, but I'm uninformed.

Vijay Chakravarthy 2023-06-14 04:44:04

I’m happy to try to answer, or could intro you to people.. been in the enterprise SaaS space for a while..

Eli Mellen 2023-06-14 10:51:03

I’ve worked on a few platforms that include guidance for data and other sorts of governance. Happy to answer whatever I can.

Jason Morris 2023-06-16 20:01:47

Thanks, both. Here's my initial Q. It feels to me like this involves setting out data governance policies that say things akin to "don't give access to personally identifying information unless the user has admin privileges", and then what happens is that the people running the software need to configure their tools to comply with that rule. That feels like it would a) make it more complicated to add new tools, driving organizations to centralize, and b) be difficult for the people writing the policy, who presumably don't know the software tools, to confirm has been implemented correctly. I'm presuming that they do some form of user testing to see if they can find violations, but they don't validate the configurations directly, and there is nothing in the way of real-time or near-real-time auditing. Is that more or less how it goes? Or is the tooling more sophisticated than that?

Eli Mellen 2023-06-19 10:41:51

You've defo hit a nail upon the head with a mighty and thunderous hammer.

The distinction between governance and what I’ll call “programmatic gate keeping” is often times 2 overlapping circles. Defining what needs to be enforced “where” can help.

In your example of access rules: don’t bother to include “governance” level guidance about that. As you say it can be nuanced and complicated. Enforce access rules in code if at all possible, and make the governance about how to use the right and good code pathway. E.g. “we’ve got OAuth set up. You must use it.”

Here governance and code world are combined, and become stronger and (ideally) self supporting.

Eli Mellen 2023-06-19 10:47:43

I think of this governance stuff as guidance for devs and other humans that direct them toward the correct and expected practices that you can then “enforce” or “verify” in code.

Where it gets tricky is when you’ve got stuff that needs “governance” that can’t be enforced in code, or codified into a CI/CD pipeline for some reason. For me, the canonical example of this is the focus of my job — accessibility.

There are accessibility auditing tools that can run automatic scans as part of a build and deploy pipeline, but they’re not really all that good. There are a bunch of accessibility rules that are codified into literal laws. So, governance says “do the things the law tells you to” so now you’ve got a 3rd category, somewhere between

  • governance
  • programmatic gate keeping

There is now also “guidance.”

Guidance, in my meaning here, is the worst kind of (read perhaps as “most difficult to verify?”) governance because it relates to code practices that are critical but potentially reliant on individual implementations and are difficult to verify…

Eli Mellen 2023-06-19 10:52:28

Also, this is, from my understanding, what led to the invention of SAFe Agile…so, um…beware, for here be seriously asinine management practices ⚠ ⚠ ⚠

Eli Mellen 2023-06-19 11:19:34

Your insight about the drive to centralize I think is correct. The oldest platform governance structures I’ve seen or had to interface with, the ones that I’d say were “successful” in their goals, were all centralized, and had a clear group or person acting as the authority…a governance group that had a governance process to gate keep what was in and out. This leads to consistency at the cost of speed.

It also centralizes failure to a single authority group in some cases.

Jason Morris 2023-06-19 16:43:20

Correct me if I'm wrong, but whether it is enforced automatically, or manually audited after the fact, it still represents an objective that was decided upon somewhere, like "use OAuth" or "comply with accessibility laws". That's the "governance" part, I would think ... the choosing what to achieve and what to avoid (and potentially but not necessarily how)? Also, I'm wondering specifically about "data" governance, which includes access security but excludes accessibility, I would think. Is "data" governance different, somehow? Is the difference meaningful? Or have we created a sort of arbitrary division because people think of data as having value now?

Eli Mellen 2023-06-19 16:44:19

when you say data governance, do you mean like access controls or like what data we keep?

Jason Morris 2023-06-19 16:47:40

From what I have been reading, it deals with security, cleanliness, non-duplication, standards and schema compliance, privacy, usability, accessibility, regulatory requirements about non-collection or disposal. Also of the organizational decision making surrounding the collection, use, management, and disposal of data.

Jason Morris 2023-06-19 16:50:10

But if I told you that the people I'm talking with in the "data governance" space seemed like they had a clear idea of what their job was, I would be lying. So it may be a "literally no one knows" sort of situation.

Eli Mellen 2023-06-19 16:52:36

Gotchya! Yes. I was speaking a bit more generally of “platform governance” than specifically data governance.

I’ve been involved with data governance within platform governance, too, but would say, as you point out, it’s a lot less well defined since what exactly “data” encompasses is a wee bit hand wavy.

Jason Morris 2023-06-19 17:04:21

Honestly, from what I have read, in the phrase "data governance" you have two terms that are deeply hand-wavy. But thanks for the clarification.

Jason Morris 2023-06-19 19:07:25

I think I'm slowly starting to hone in on what's important for me, here... The thing that the tool I am working on offers is an easier and more verifiable symmetry between written rules and their encoded equivalent. That tends to be most useful when following the rules is very important, but the requirements and text of the rule are not under your control, and the rules are complicated, and demonstrating adherence to them in automated systems is important. In data governance, with the exception of things like GDPR, you usually have control over the rules. If they are hard to implement, you can rewrite them. So I'm curious whether there is a pain point in data governance around automating and demonstrating compliance with written rules, in the terms of those written rules. For instance, if it was possible to run something in CI/CD that would take test inputs and outputs, detect data policy violations, and flag them with a link to the written policy they violate ... is that anything? Does that make anyone's life better? Or is that like a hat on a hat?
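A minimal sketch of that CI/CD idea, using the access rule from earlier in the thread as the one encoded policy. This is not Blawx's API or output format; the `Policy` class, the `DG-001` identifier, the `example.com` URL, and the `PII_FIELDS` set are all hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Policy:
    id: str                                 # hypothetical policy identifier
    text: str                               # the written rule, verbatim
    url: str                                # hypothetical link to the written policy
    violates: Callable[[dict, dict], bool]  # (request, response) -> True if violated

# Assumed definition of "personally identifying" fields, for illustration only.
PII_FIELDS = {"email", "date_of_birth"}

def pii_without_admin(request, response):
    """Encodes: no PII in a response unless the requester has admin privileges."""
    return request.get("role") != "admin" and bool(PII_FIELDS & set(response))

POLICIES = [
    Policy(
        id="DG-001",
        text="Don't give access to personally identifying information "
             "unless the user has admin privileges.",
        url="https://example.com/policies/DG-001",
        violates=pii_without_admin,
    ),
]

def audit(cases, policies=POLICIES):
    """Check recorded (request, response) pairs; return (case index, policy id, url)."""
    findings = []
    for index, (request, response) in enumerate(cases):
        for policy in policies:
            if policy.violates(request, response):
                findings.append((index, policy.id, policy.url))
    return findings

cases = [
    ({"role": "admin"}, {"email": "a@example.com"}),
    ({"role": "viewer"}, {"email": "a@example.com"}),
    ({"role": "viewer"}, {"name": "A"}),
]
findings = audit(cases)
```

A CI job could fail the build whenever `findings` is non-empty and print each link, so the implementer sees the written policy rather than an opaque test name.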

Eli Mellen 2023-06-19 19:11:41

I think that is something!

Cynically, I think it is something because it is an element that can be automated, and plugged into an existing system. In my experience, folks want people out, and CI/CD automation in: automate as much stuff as possible, no matter the implications of what it means for the system.

Vijay Chakravarthy 2023-06-19 19:14:19

So I think the presence of data governance and compliance rules is very valuable for enterprises but hampers agility and iteration based testing. So you have IT/Infosec being the bad guys while other parts of the org want to innovate. If you can solve for this cleanly, lots of value there.

Jason Morris 2023-06-19 19:20:11

Yeah, I'm just thinking the abstraction layer becomes where the bad guys live. Tell us about your software's data model, and we will build a connector to our compliance API, so you can automatically test or audit. But if your model doesn't fit our API, we are still the bad guys... I'm not sure how much better that is.

Vijay Chakravarthy 2023-06-19 19:22:27

Are you thinking of doing this as a product?

Vijay Chakravarthy 2023-06-19 19:23:31

not sure if that's relevant

Jason Morris 2023-06-19 19:47:23

LexiFi is in the same "computational law" space that I live in, but it solves for a different problem. I'm looking for structural isomorphism to the legal text, and ease of use (à la low-/no-code). Functional programming gets you neither of those things, but gets you other cool stuff. Catala is another functional approach in the space, for tax and benefit systems. OpenFisca is object-oriented, aimed at comparative analysis and microsimulation of tax & benefits rules. I think the Accord project has a language aimed at smart contracts, there's DataLex from AustLII, and a few others.

Jason Morris 2023-06-19 19:49:08

My entry in the space is Blawx, which is currently still more prototype than product. I am currently hosted inside a department of the Canadian federal government that is trying to find an in-house demonstration use-case, and data governance is the most promising current proposal (of about 4 options). Just trying to understand whether there is a problem/tool fit, or not.

Jason Morris 2023-06-19 19:49:48

I have a meeting with the CDO later this week, trying to understand more before I get there.

Jason Morris 2023-06-19 19:49:58

Appreciate all the help!

Vijay Chakravarthy 2023-06-19 19:53:07

What would be examples of roles that could help answer questions for you? My take is that the buyer here would be on the innovation side; the governance side is likely to have veto powers but is not necessarily a purchaser. My network is pretty strong on the SaaS side, lemme know if I can introduce you to specific personas…

Jason Morris 2023-06-19 19:57:21

Yeah, that's one of the issues. It's fundamentally a bit of a two-sided tool. You need someone responsible for writing the rule but not for implementing it, and someone responsible for implementing it, but not for writing it. The rule expert validates the encoding, the implementer just uses the API. So in data governance, I'm interested in data governance policy writers, who wish the implementers were better at following the rules, and software devs and data stewards, who wish following the rules was easier, plus whoever is paying them, and wishes less time was spent on compliance.

Jason Morris 2023-06-19 19:59:11

If you are right, and the purchaser is the innovator, then I need to talk to people who are building software or dealing specifically with software compliance testing in heavily regulated enterprise environments. Or something? :)

Vijay Chakravarthy 2023-06-19 20:00:29

yup - healthcare, banking, insurance..