General updates

  • Lucas brought planning stuff to a close over the last few weeks
  • Looked into Auto-ML stuff
  • Singapore offices are still mostly closed, so not fertile ground for a good game until at least July, or midway through July
    • Singapore has moved into initial phase of re-opening stuff
  • For the next month or so, tie up work on planning, and then bring the RL and planning work together

Akaash RL work

  • Submitted clustering paper to ACEEE conference
  • Came back with a thorough list of revisions:
  1. Paper is pretty technical for the conference, which is about building and energy-saving programs
    • Could target for a less technical audience
    • Make the flow of the paper more clear
  2. To clear up the flow of the paper, wanted to create a box and arrow diagram of the methods
    • Both Lucas and Akaash created a draft; see if they can edit them and combine different aspects

Akaash’s flowchart of the model

  1. Choose a supervised learning model that we can use once we have predicted clusters. The dependent variable of the supervised learning problem is the energy change after the game. The clustering whose clusters best predict which groups change their energy use after the game is the best one
  2. Choose a clustering approach - k-means
  3. Apply the regression on all data, and then on just the clusters, and evaluate the output
    • Look at p-values, and see if they are statistically significant
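A minimal sketch of the three steps above, using only the standard library. Everything named here is an assumption for illustration: the toy data, the `points` feature used as the regressor, and the helper names `kmeans_1d` and `ols_slope`. It compares regression slopes on all data versus per cluster; it does not compute p-values, which in practice would come from a statistics package.

```python
from statistics import mean

def kmeans_1d(values, k, iters=20):
    """Tiny 1-D k-means; returns one cluster label per value (needs k >= 2)."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centers across the sorted data.
    centers = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assign each value to its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Move each center to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = mean(members)
    return labels

def ols_slope(x, y):
    """Slope of the least-squares line y = a + b*x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx

# Hypothetical toy data: energy use before the game, a made-up game feature,
# and the energy change after the game (the dependent variable).
baseline = [10, 11, 12, 13, 50, 51, 52, 53]
points = [1, 2, 3, 4, 1, 2, 3, 4]
change = [0, 0, 0, 0, -2, -4, -6, -8]

labels = kmeans_1d(baseline, k=2)          # step 2: cluster on pre-game energy
overall = ols_slope(points, change)        # step 3a: regression on all data
per_cluster = {}
for c in sorted(set(labels)):              # step 3b: regression within each cluster
    idx = [i for i, lab in enumerate(labels) if lab == c]
    per_cluster[c] = ols_slope([points[i] for i in idx], [change[i] for i in idx])

print("overall slope:", overall)
print("per-cluster slopes:", per_cluster)
```

In this toy data only the high-baseline cluster responds, so its slope is steeper than the slope on the whole dataset. That is the kind of gap the per-cluster evaluation is meant to surface.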

Lucas’s flowchart

Given energy data, you can either

  1. Segment data - energy data before experiment
    • Different types of clusters, kmeans, gmm, etc
    • Evaluate performance predicting energy after the experiment
    • “Winning clustering regime”
    • Persistence on most responsive cluster
  2. Regression on the whole dataset
    • General persistence of the energy saving behavior

uid: 202006031828 tags: #raise #meetings

February 22, 2023

Closing Airflow was the biggest contribution. Biggest net accomplishment:

  • Taken workflow engine, swapped it out and replaced it with a new one
  • Every company is going to be in the middle of migration
  • Got the not-yet-working system working

Dealt with moving the decline processor into the loader framework. When we had issues with production, got them fixed

  • Fixing production issues is pretty good as well!

Kafka mock stuff and Hypothesis stuff were big steps in the testing infrastructure. Tested it in the decline ingester and found a bug with the testing tool. Property-based testing (generating random examples rather than having a tiny handful of golden examples) is a much more sophisticated way of doing testing
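A minimal sketch of the property-based idea, using only the standard library rather than the Hypothesis library itself. The record format and the names `encode_record`, `decode_record`, and `check_roundtrip` are hypothetical, not the actual ingester code. Instead of a few golden examples, it generates many random records and checks one property: decoding an encoded record gives back the original.

```python
import json
import random
import string

def encode_record(record):
    """Hypothetical serializer: dict -> bytes."""
    return json.dumps(record, sort_keys=True).encode("utf-8")

def decode_record(payload):
    """Hypothetical deserializer: bytes -> dict."""
    return json.loads(payload.decode("utf-8"))

def random_record(rng):
    """Generate a random dict of short string keys to integer values."""
    record = {}
    for _ in range(rng.randint(0, 5)):
        key = "".join(rng.choice(string.ascii_letters) for _ in range(rng.randint(1, 8)))
        record[key] = rng.randint(-10**6, 10**6)
    return record

def check_roundtrip(trials=200, seed=0):
    """Property: decode(encode(r)) == r for every generated record r."""
    rng = random.Random(seed)  # seeded so a failure can be reproduced
    for _ in range(trials):
        record = random_record(rng)
        assert decode_record(encode_record(record)) == record, record
    return trials

print(check_roundtrip(), "random records round-tripped")
```

A library like Hypothesis adds smarter generation and shrinking of failing examples on top of this basic loop.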

Markdown in datanav

What you look for is accomplishments, achievements, things that move the needle

All code that is in production, touching critical systems

  • More work to get context on existing systems and not break things

People are really looking at how I’m contributing to the overall direction of the company, and how these things actually benefited the larger organization as a whole

Dealing with the messiness of the real world: there’s already code there and you have to modify it

People reading the resume might not know what Kubernetes is, but they will know what it means to provide business value

Mentioning that I changed the ETL system, swapped a big part of it out, mentioning Kubernetes is important

Wouldn’t be bad to mention the pipeline aspect, data pipelines are used with machine learning, now a lot of companies are building up data pipelines

A new job has been forming over the last few years (machine learning engineer), where the focus is on running models in production

The whole process of taking raw data at the beginning of that pipeline, and converting it to usable data is pretty valuable - Other companies might be interested in how Theorem does it

Theorem might also open up a machine learning engineer job

Student membership in IEEE or ACM, which has job links. Looking for people that have academic qualifications, or looking for students

Try to see if I can get a software contractor job

Kelly worked for Matrix Resources, the firm that provided the labor count

Triplebyte, and some of the other AngelList stuff like that

Ping career center, it’s worth a shot


uid: 202005261444 tags: #2020recruiting

February 22, 2023

Overview of Theorem

General Overview

Theorem Fund

  • Managed assets, loans, debt

Theorem Company

  • People, strategy, IP

LPs

  • University endowments, rich dudes, pensions, etc
  • Give Theorem money, get some money back? Missed this part

Partners

  • LC, Upstart, Prosper
  • get a fixed amount of money for each loan that they originate
  • Sell the loans to someone who wants something more stable
  • Loans provide principal and interest that increase our returns
  • Core interaction of research

In practice, things are slightly more complicated, there might be a third party (like a bank)

  • Bank might hold a loan for a short amount of time, or bank might serve as a trust

Banks/Capital Markets

  • Can offer to give up some of our loans in exchange for cash
  • We think our loans will return more than our cost of borrowing
  • Will help increase our return
  • Will put up a loan collateral so we can get more money into the fund
  • Warehouse facility
  • ABS - Way to package loans together to change the risk-return profile as compared to individual loans

Partners

  • Three main things we do:
    1. Active selection
    2. Passive purchasing
    3. ABS

Active selection

  • Get data from LC, prosper
  • Get all data after origination, including the stuff that tells them whether they should take the loan or not
  • All the loans are currently unsecured
    • Mortgages, by contrast, are secured
  • LC and Prosper want to produce as many loans as possible, because they get a fixed amount for each loan
  • However, they don’t have unlimited money, so they have to sell loans as fast as possible
  • People trust Theorem: over the last 5 years, you’ve been positive, so we’ll buy your loans at a slight discount
    • If you don’t have your own model, if you’re not a smart investor, then you can rely on people like Theorem
  • This is the bread and butter of Theorem: run the model to get the default rate, check whether our expected return from that loan is good enough or not, and send the information back to the platform (as fast as possible).
    • If we do it too slowly, then someone else might get the loan, and we’ll get left in the dust
    • This is done with the auction agent
      • Currently turned off because of corona
    • There are no transaction fees
  • Many platforms sell the loans to the banks
  • Platforms aren’t using things like

Passive Purchasing

  • Acquiring

uid: 202005221033 tags: #theorem #meetings

February 22, 2023

Red-black trees

Follow up to 61B Application 202005111503

Final Presentation: https://docs.google.com/presentation/d/1N-4scA49OzqpBgj9vMNlcxGICdljYSD0U8ERGjdeQ6w/edit#slide=id.p

Binary Search Tree

  • A binary search tree is a tree with one additional constraint: it keeps its elements in a particular order. Formally, each node in the BST has two children, a left child and a right child (any that are missing are treated as nil nodes). Nodes are placed based on their values: everything in a node’s left subtree is smaller than the node, and everything in its right subtree is larger.
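A minimal sketch of the ordering constraint, assuming distinct comparable keys. The names `Node`, `insert`, `contains`, and `in_order` are illustrative and not taken from the course code. Missing children are represented by `None`, playing the role of the nil node.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None   # subtree of smaller keys (None = nil node)
        self.right = None  # subtree of larger keys (None = nil node)

def insert(node, key):
    """Insert key into the subtree rooted at node; return the subtree root."""
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    return node  # duplicates are ignored

def contains(node, key):
    """Walk down from the root, going left or right by comparison."""
    while node is not None:
        if key == node.key:
            return True
        node = node.left if key < node.key else node.right
    return False

def in_order(node):
    """Left subtree, node, right subtree: yields the keys in sorted order."""
    if node is None:
        return []
    return in_order(node.left) + [node.key] + in_order(node.right)

root = None
for k in [5, 2, 8, 1, 9, 3]:
    root = insert(root, k)
print(in_order(root))
```

Because of the ordering constraint, an in-order traversal visits the keys in sorted order, and a lookup only ever follows one path down the tree.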

2-3-4 Trees

  • B-tree invariants:
    • All leaves must be the same distance from the source
    • A non-leaf node with k-items must have exactly k+1 children.
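The two invariants can be checked mechanically. This is a sketch under an assumed representation, not the course's: each node is a tuple `(items, children)`, and a leaf has an empty `children` list. The function names are made up for illustration.

```python
def leaf_depths(node, depth=0):
    """Return the set of depths at which leaves occur."""
    items, children = node
    if not children:
        return {depth}
    depths = set()
    for child in children:
        depths |= leaf_depths(child, depth + 1)
    return depths

def child_counts_ok(node):
    """A non-leaf node with k items must have exactly k + 1 children."""
    items, children = node
    if not children:
        return True
    return len(children) == len(items) + 1 and all(child_counts_ok(c) for c in children)

def satisfies_invariants(node):
    """Both invariants: all leaves at one depth, and k items -> k + 1 children."""
    return len(leaf_depths(node)) == 1 and child_counts_ok(node)

good = ([5], [([2, 3], []), ([8], [])])
uneven = ([5], [([2], []), ([8], [([7], []), ([9], [])])])   # leaves at depths 1 and 2
wrong_fanout = ([5, 7], [([2], []), ([8], [])])              # 2 items but only 2 children

print(satisfies_invariants(good), satisfies_invariants(uneven), satisfies_invariants(wrong_fanout))
```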

Left-leaning red-black trees (LLRBs) are structurally identical to 2-3 trees!

Why red-black tree over 2-3 tree?

  • Inconvenient to manage multiple different types of nodes, need to keep track of how to insert in each of them, etc.

Insertion into a red-black tree

  1. When inserting: Use a red link.
  2. If there is a right leaning “3-node”, we have a Left Leaning Violation.
    • Rotate left the appropriate node to fix.
  3. If there are two consecutive left links, we have an Incorrect 4 Node Violation.
    • Rotate right the appropriate node to fix.
  4. If there are any nodes with two red children, we have a Temporary 4 Node.
    • Color flip the node to emulate the split operation.
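The four rules above can be sketched as a recursive insert for a left-leaning red-black tree. This follows the common textbook formulation, with the color stored on the node to represent the link from its parent. The names are illustrative, not taken from the course code.

```python
RED, BLACK = True, False

class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.color = RED  # rule 1: a new node is attached with a red link

def is_red(node):
    return node is not None and node.color == RED

def rotate_left(h):
    x = h.right
    h.right, x.left = x.left, h
    x.color, h.color = h.color, RED
    return x

def rotate_right(h):
    x = h.left
    h.left, x.right = x.right, h
    x.color, h.color = h.color, RED
    return x

def flip_colors(h):
    h.color = RED
    h.left.color = BLACK
    h.right.color = BLACK

def _insert(h, key):
    if h is None:
        return Node(key)
    if key < h.key:
        h.left = _insert(h.left, key)
    elif key > h.key:
        h.right = _insert(h.right, key)
    if is_red(h.right) and not is_red(h.left):   # rule 2: right-leaning 3-node
        h = rotate_left(h)
    if is_red(h.left) and is_red(h.left.left):   # rule 3: two consecutive left red links
        h = rotate_right(h)
    if is_red(h.left) and is_red(h.right):       # rule 4: temporary 4-node
        flip_colors(h)
    return h

def insert(root, key):
    root = _insert(root, key)
    root.color = BLACK  # the root is always black
    return root

def in_order(node):
    return [] if node is None else in_order(node.left) + [node.key] + in_order(node.right)

def height(node):
    return 0 if node is None else 1 + max(height(node.left), height(node.right))

root = None
for k in range(1, 8):  # sorted input would degenerate a plain BST into a chain
    root = insert(root, k)
print(root.key, in_order(root), height(root))
```

Inserting 1 through 7 in sorted order would give a plain BST of height 7. With the rotations and color flips, the result here is a perfectly balanced tree of height 3 rooted at 4.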

Important red-black tree properties

General Notes

  • Each node of the binary tree has an extra bit, and that bit is often interpreted as the color (red or black) of the node. These color bits are used to ensure the tree remains approximately balanced during insertions and deletions.
  • The leaf nodes of red–black trees do not contain data.
    • Can either be null or a sentinel
  • Search time: a search traverses from the root to a leaf, so a balanced tree of n nodes, having the least possible tree height, gives O(log n) search time.

uid: 202005202245 tags: #algorithms

February 22, 2023

I can answer this partly through the lens of eugenics studies. This will at least give an idea of what the intelligentsia thought of race. While the term is mostly associated with Nazi Germany, it also played a role for a while in both pre-Soviet and Soviet Russia, and is a useful way to learn about their ideas of race.

To start with pre-Soviet Russia, a sample of Russian ideas can first be seen in the two Russian men who took part in the International Eugenics Congress in 1912, the anarchist Petr Kropotkin and the journalist Isaak Shklovskii. Both were vehemently opposed. Kropotkin took the expected class angle, while Shklovskii shared the viewpoint of most Russian scientists, that there is no such thing as pure races (though a few disagreed and were proponents of a Great-Russian race).

That said, Russian eugenics focused on ‘the benefit of humankind’ instead of racialization. This continued with the rise of the Soviet Union, when eugenics became a popular subject in scientific magazines and popular fiction. Because of the political position of the SU, Soviet eugenics was isolated from the rest of the field, although contact did occasionally happen. This, in combination with the Russian political background, led it to largely eschew race.

This was reflected in a study done by Russian eugenicists in the early 1920s, when they started studying the inheritance of human blood groups and mapping the different blood groups and ethnicities in the country. However, where western eugenicists were concerned with various racial issues, the Soviets rightfully paid no attention to this, believing that there is no such thing as an ‘inborn criminal’. There was no research done into the results of race mixing, or into what other countries considered ‘lesser races’, as they simply didn’t believe in this!

Rather than races, the Russian intelligentsia believed in the direct inheritance of skills from father to son, from mother to daughter and so on. Thus they believed the intelligent would have intelligent kids, and the artistic would have artistic kids. This still led to somewhat of an elitist bias, but not a racist one.

Obviously this put them somewhat at odds with the general Soviet sentiment, but they managed to survive into Stalin’s Soviet Union. There the discipline’s elitist, bourgeois attitude finally became fatal, and it died in 1948. This was due to several factors: the aforementioned attitude, personal conflicts between eugenicists and Stalinists, and comparison with racist Nazi policies. Before that, however, the scientists continued their attacks on the fascist and vicious doctrine of racial purity employed by the Nazis in the class war.

The SU was from the start a multinational and multi racial state, wherein everyone (officially) had equal rights. Thus the idea of racial division was anathema to the creation of its classless, multiethnic society that constantly denounced the racism of its rivals.

Source: largely summarized from ‘From ‘Beastly Philosophy’ to Medical Genetics: Eugenics in Russia and the Soviet Union’, which also serves as a resource and link to various other papers on eugenics and racial ideas in the Soviet Union.

Next article: 202005151931 Part 8 - Soviet union

February 22, 2023

Thank you for your great question. I must humbly admit that I do NOT base it on study of literature, but on my own education / upbringing as a choir musician / conductor, then film historian, in Russia. I am not a complete philistine, I have a PhD in art history (actually film history) from Gerasimov Film Institute, but of course I’m not in any way an authority on Russian Imperial or Soviet ethnic culture. I went off my general gestalt understanding that I sourced off older cultured people around me and literature and many, many fights and arguments in the Russian intellectual space. I have to admit, I sang the “Birch” song as a choirboy for YEARS before I even had the thought to ask what the hell it is about. I found out definitively only today.

As I see it, the problem is when something is rigidly regulated, it tends to kill off actual nuanced knowledge. The onus moves to fulfilling the required needs, and fitting the very specific mold. It’s like if a Native American culture had to fit INSIDE a mental image that an average American has about Native American culture. It’s not so much racism as it is reductionism via ignorance and deadlines.

Imperial Russia pointedly did not place much emphasis on even finding out anything about local cultures (after all, all non-Russian citizens there were officially called “foreignborn”/“alienborn” - инородцы), it was thoroughly imperialistic in that regard (less so in others, since as OC mentioned its colonies were its integral territories). It was rather tolerant, except for Jews - but it was uninterested. Royal/aristocratic Russian-centric culture was the norm.

So for example the question of which songs my Buryatian great-great-great-grandmother sang to her kids was absolutely irrelevant to any government official in Imperial Russia. Hell, we were slaves for a long time under Russian governorship. In the USSR, the mandated behaviour was to cherish ethnic culture, but of course you can’t just spring up mature ethnological practices or personnel out of the blue. Or appreciation for ethnic culture as it is, by a formerly imperial metropolitan audience. I think it was simply what they could do and bothered to do.

The problem is, when your ethnographic activity is unmandated and unfinanced, it may be fledgling, but the only thing that hinders it is natural causes: urbanization, people dying off, bad memory, unfashionableness (sorry don’t know how to say it). Still, ethnographers can save the nuggets of raw data and organize it in their unpaid hours. But if there is a govt program to generate ethnic culture in an organized manner, there’s no place for ethnographic practice. Like in corporate, it has to be compiled, tidied up, collated, and brought up the chain for approval. As a person who has worked with corporate clients, I can say that it tends to wash out things until they’re a shadow of themselves. Made by committee.

Next article: 202005151930 Part 7 - Soviet union

February 22, 2023