# Big Tech was moving cautiously on AI. Then came ChatGPT.
source: https://ift.tt/BhPDRJf tags: #literature #inbox uid: 202302201658 —
“People feel like OpenAI is newer, fresher, more exciting and has fewer sins to pay for than these incumbent companies, and they can get away with this for now,” said a Google employee who works in AI, referring to the public’s willingness to accept ChatGPT with less scrutiny.
The technology underlying ChatGPT isn’t necessarily better than what Google and Meta have developed, said Mark Riedl, professor of computing at Georgia Tech and an expert on machine learning. But OpenAI’s practice of releasing its language models for public use has given it a real advantage.
“For the last two years they’ve been using a crowd of humans to provide feedback to GPT,” said Riedl, such as giving a “thumbs down” for an inappropriate or unsatisfactory answer, a process called “reinforcement learning from human feedback.”
Moving from providing a range of answers to queries that link directly to their source material, to using a chatbot to give a single, authoritative answer, would be a big shift that makes many inside Google nervous, said one former Google AI researcher.
I was talking about this with Emile yesterday - he uses ChatGPT in this way, as a Google replacement for learning programming languages.
# I Don’t Understand How Eric Zhang Is So Productive
https://www.ekzhang.com/resume https://www.ekzhang.com/projects https://www.ekzhang.com/writing https://twitter.com/ekzhang1
Seriously, like wtf… This guy is like Linus Lee (https://thesephist.com/posts), but even more ridiculously intimidating. How does he have the time to
- write out insane latex notes for all of his classes
- make like 20 ridiculously good open-source tools
- be a founding engineer at an extremely technically sound company
- be a senior in college??
- post on twitter
- write great blog posts
- have a beautiful personal website
- have friends?
WTF… where is the time? I’d guess that the answer is he’s so genuinely efficient that he barely spends any time on setting up his side projects because he has an entire infrastructure stack already set up, and he’s very fast at programming and practiced at bringing ideas from 0-1, so he can get things set up ridiculously fast. I’ve seen an example of this with Keyhan. Some people are just incomprehensible.
Is it worth thinking about joining Modal just for this guy? It sounds like the kind of high-risk, high-reward thing that Ben Kuhn would recommend I do. Don’t be scared of overconfidence 202212101111. The quality of one of the founding engineers is a high-quality heuristic for the quality of the founding team and the technical strength of the product in general, IMO. Also, search for outliers (https://www.benkuhn.net/outliers/), and this guy is one huge outlier.
uid: 202302200143 tags: #random #thoughts #people-that-intimidate-me #recruiting
# The Magic of Small Databases
source: https://ift.tt/jbP54y6 tags: #literature #inbox #relational-thinking #living-well #productivity uid: 202301292232 —
We’ve built many tools for publishing to the web - but I want to make the claim that we have underdeveloped the tools and platforms for publishing collections, indexes and small databases. It’s too hard to build these kinds of experiences, too hard to maintain them, and there’s a lack of collaborative tools.
But it shows some of the potential use cases where people are looking for “database publishing”.
Let’s think through the kinds of use cases and functionality that are important. I’d say that the core features are (a toy sketch of this data model follows the list):
- Creating collections of objects
- Adding and updating metadata for the objects (ideally in bulk if needed)
- Creating collections, relationships and pathways through the data
- Collaborating on these with others
- Publishing to the web in an easy-to-consume way, with stable URLs, search, etc.
- Open standard file formats for export / import / desktop editing
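As a toy sketch of what this data model could look like - the shape and field names here are my own illustration, not anything from the article:

```ts
// Illustrative only: a minimal shape for a published collection with objects,
// per-object metadata, relationships, collaborators, and an open export format.
interface CollectionObject {
  id: string;
  title: string;
  metadata: Record<string, string | number | boolean>; // editable in bulk
}

interface Relationship {
  fromId: string;
  toId: string;
  kind: string; // e.g. "cites", "part-of", "sequel-of"
}

interface PublishedCollection {
  slug: string; // stable URL component for publishing
  objects: CollectionObject[];
  relationships: Relationship[];
  collaborators: string[];
}

// Open, standard export format (plain JSON here) for import / desktop editing.
const exportCollection = (c: PublishedCollection): string =>
  JSON.stringify(c, null, 2);
```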
I’d like to call out Datasette in particular. Datasette is “An open source multi-tool for exploring and publishing data” created and maintained by Simon Willison.
Of all the projects I’ve come across, Datasette feels the most closely ideologically aligned with what I want. It’s open source, Simon really cares about helping non-developers use it (via a cloud product, a desktop product, etc.) - it’s even building out web-scraping functionality! Datasette is like a Swiss Army knife for exploring data, but it also allows you to publish that data to the web.
# Four thousand weeks
source: https://ift.tt/hL4xzMq tags: #literature #insights #living-well uid: 202301260144 —
We are forced to accept that there will always be too much to do; that you can’t make the world run at your preferred speed and so there are tough choices to be made: which balls to let drop, which people to disappoint, which cherished ambitions to abandon, which roles to fail at.
Once you truly understand that you’re guaranteed to miss out on almost every experience the world has to offer, the fact that there are so many you still haven’t experienced stops feeling like a problem. Instead, you get to focus on fully enjoying the tiny slice of experiences you actually do have time for.
Other human beings are always impinging on your time in countless frustrating ways. In an ideal world, the only person making decisions about your time would be you - but that kind of total control comes at a cost that’s not worth paying.
I used to think that having to coordinate schedules and plans with other people was not worth my time. I was so mistaken.
However, the two things must be mingled and varied, solitude and joining a crowd: the one will make us long for people and the other for ourselves, and each will be a remedy for the other; solitude will cure our distaste for a crowd, and a crowd will cure our boredom with solitude.
# Don’t Block the Event Loop (or the Worker Pool) | Node.js
source: https://ift.tt/OWUESla tags: #literature #inbox #javascript #programming #software-engineering #airtable uid: 202301260138 —
You can create and manage your own Worker Pool dedicated to computation rather than the Node.js I/O-themed Worker Pool. The most straightforward way to do this is to use Child Process or Cluster.
~Today I learned that the “Worker Child” uses an actual Node.js concept called the “child” process, and isn’t just a random abstraction that Airtable created.~
Actually I think that’s wrong.
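A minimal sketch of the Child Process approach from the quote above - the file names and the Fibonacci task are made up for illustration, and a real pool would reuse a fixed set of children instead of forking one per task:

```ts
// compute-parent.ts: offload CPU-bound work to a child process so the
// Event Loop stays free for I/O.
import { fork } from "node:child_process";

export function fibInChild(n: number): Promise<number> {
  return new Promise((resolve, reject) => {
    const child = fork("./compute-child.js"); // hypothetical worker script
    child.once("message", (result) => {
      child.kill();
      resolve(Number(result));
    });
    child.once("error", reject);
    child.send(n); // hand the work to the child
  });
}

// compute-child.ts (the worker side, shown as comments to keep this in one block):
//   process.on("message", (n) => {
//     let a = 0, b = 1;
//     for (let i = 0; i < Number(n); i++) [a, b] = [b, a + b];
//     process.send(a);   // report the result back to the parent
//     process.exit(0);
//   });
```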
However, for complex tasks you should consider bounding the input and rejecting inputs that are too long. That way, even if your callback has large complexity, by bounding the input you ensure the callback cannot take more than the worst-case time on the longest acceptable input. You can then evaluate the worst-case cost of this callback and determine whether its running time is acceptable in your context.
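A hedged sketch of what “bounding the input” can look like - the duplicate check is a stand-in task I picked, not something from the guide:

```ts
// Cap the input size so a callback with super-linear cost has a known worst case.
const MAX_ITEMS = 10_000;

function hasDuplicates(items: string[]): boolean {
  if (items.length > MAX_ITEMS) {
    throw new Error(`Refusing ${items.length} items (max ${MAX_ITEMS})`);
  }
  // O(n^2) in the worst case, but n <= MAX_ITEMS, so the cost is bounded.
  return items.some((item, i) => items.indexOf(item) !== i);
}
```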
JSON.parse and JSON.stringify are other potentially expensive operations. While these are O(n) in the length of the input, for large n they can take surprisingly long.
If your server manipulates JSON objects, particularly those from a client, you should be cautious about the size of the objects or strings you work with on the Event Loop.
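A small sketch of that caution in practice - the 1 MB cap is an arbitrary number I picked, not a recommendation from the docs:

```ts
// Refuse to JSON.parse huge client payloads on the Event Loop.
const MAX_JSON_BYTES = 1_000_000; // illustrative cap; tune for your latency budget

function parseBounded(body: string): unknown {
  if (Buffer.byteLength(body, "utf8") > MAX_JSON_BYTES) {
    throw new Error("Payload too large to parse on the Event Loop");
  }
  return JSON.parse(body); // still O(n), but n is now bounded
}
```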
# Spitballing Ideas for Things I’ve Worked On at Airtable for My Resume
Owned permanent deletion, a highly important feature that slipped through the cracks
- coordination between multiple teams - service orchestration, even talking to CSMs who need specific information their enterprises are asking for.
Main shard resilience
- find initial motivational posts
- worked for 1.5 years, making the biggest single point of failure in Airtable’s architecture resilient
- caching (what were the impacts here? How much did caching improve? What’s the QPS?)
- developed caching abstractions
- shard assignments - in hundreds of callsites throughout the codebase, in all possible process types
- developed a shard poisoning algorithm to guard against race conditions in worker assignment, making sure we would not try to commit data on the wrong database shard
- the system relied heavily on multiple sets of transactions against the main shard and live shards, and any time you have multiple transactions operating in parallel on the same local state, you can get race conditions
- designed and implemented a circuit breaker algorithm for transparently falling back queries to the replica (a rough sketch of the idea follows this list)
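A rough, hypothetical sketch of the circuit breaker + replica fallback idea - not Airtable’s actual implementation; the class and parameter names are mine:

```ts
// Count recent failures against the primary; once a threshold is crossed,
// route reads to the replica for a cool-down period, and fall back for the
// failing query as well.
type QueryFn<T> = () => Promise<T>;

class ReplicaFallbackBreaker {
  private failures = 0;
  private openUntil = 0;

  constructor(
    private maxFailures = 5,     // failures before the breaker opens
    private cooldownMs = 30_000, // how long to prefer the replica
  ) {}

  async run<T>(primary: QueryFn<T>, replica: QueryFn<T>): Promise<T> {
    if (Date.now() < this.openUntil) {
      return replica(); // breaker is open: read from the replica
    }
    try {
      const result = await primary();
      this.failures = 0; // a healthy call closes the breaker
      return result;
    } catch {
      if (++this.failures >= this.maxFailures) {
        this.openUntil = Date.now() + this.cooldownMs;
        this.failures = 0;
      }
      return replica(); // transparently fall back for this query too
    }
  }
}
```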
Onboarding buddy. Promoted to L4 after an exceptional rating in less than a year.
Spearheaded Airtable’s new grad program from scratch. Designed questions, led onboarding sessions, and provided a safe space.
uid: 202301260114 tags: #airtable #recruiting