The new runs were not successful.

But we got real improvements. While the model still says unnecessary things, it now hallucinates much less and gets some things right.

The run was on an L40S GPU. We don't know why we picked it, to be honest.

But we found out that the data generator is flawed.

Normally, LaaLM-exp-v1 was trained on scenarios where later commands actually reflected the effects of earlier commands.

So in the data generator, if the generated data did `mkdir hello`, a later `ls` would show `hello`. This was done using a Linux simulator.

But LaaLM-v2's data generator generates each command at random, without caring about anything that came before.

So instead it may do `cd hello`, but then does `touch hi` and goes on generating randomly. There is no persistence, which is literally what LaaLM-v2 is for.
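To make the difference concrete, here is a minimal sketch of the exp-v1 approach: keep a simulator in the loop so a later `ls` really shows what an earlier `mkdir` created. The class and function names (`TinyLinuxSim`, `generate_session`) and the tiny command set are our own illustrative choices, not the actual LaaLM code.

```python
import random

class TinyLinuxSim:
    """Minimal stand-in for a Linux simulator: tracks just enough
    state (directories and files) that later commands reflect the
    effects of earlier ones."""

    def __init__(self):
        self.dirs = set()
        self.files = set()

    def run(self, cmd):
        parts = cmd.split()
        if parts[0] == "mkdir":
            self.dirs.add(parts[1])
            return ""
        if parts[0] == "touch":
            self.files.add(parts[1])
            return ""
        if parts[0] == "ls":
            return "\n".join(sorted(self.dirs | self.files))
        return ""

def generate_session(n_commands, seed=0):
    """Sample a shell session where every output comes from the
    simulator, so the transcript is internally consistent."""
    rng = random.Random(seed)
    sim = TinyLinuxSim()
    names = ["hello", "hi", "docs", "notes"]
    session = []
    for _ in range(n_commands):
        cmd = rng.choice([f"mkdir {rng.choice(names)}",
                          f"touch {rng.choice(names)}",
                          "ls"])
        session.append((cmd, sim.run(cmd)))
    return session
```

A stateless v2-style generator would instead sample both the command and its output independently, which is exactly the persistence bug described above.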
We will first try using LaaLM-exp-v1's training dataset generator here to see if our theory is correct.

We will give an update when the results are out.

#### 4 March 2026 Update

But we thought in the meantime that Transformers are too heavy for what LaaLM does.

So we are going to use an LSTM-based architecture.

LSTMs are not designed for big models, but LaaLM is very simple, so an LSTM is a better fit for it.
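For reference, the per-token recurrence an LSTM runs is small, which is why it suits a simple model. Below is a single-unit, scalar-weight sketch of one LSTM step — illustrative only, not the LaaLM-v2 code, and a real model would use vector-valued gates.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, w):
    """One step of a one-unit LSTM cell (scalars for clarity).
    w maps each gate name to (input weight, recurrent weight, bias):
    'i' = input gate, 'f' = forget gate, 'o' = output gate,
    'g' = candidate cell update."""
    i = sigmoid(w["i"][0] * x + w["i"][1] * h_prev + w["i"][2])
    f = sigmoid(w["f"][0] * x + w["f"][1] * h_prev + w["f"][2])
    o = sigmoid(w["o"][0] * x + w["o"][1] * h_prev + w["o"][2])
    g = math.tanh(w["g"][0] * x + w["g"][1] * h_prev + w["g"][2])
    c = f * c_prev + i * g   # cell state carries long-range memory
    h = o * math.tanh(c)     # hidden state is this step's output
    return h, c
```

Unlike a Transformer, which attends over the whole context at every token, this recurrence does constant work per token, trading parallelism for simplicity.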
But we can assure you that LaaLM-v2 is close to release.

#### 6 March 2026 Update

We have updated the codebase, but we have a problem.

Speed.

Speed is crucial to us because we need rapid experimentation. If we take too much time, the cloud bill will explode.

But that is also the problem.

We can't find an accelerator that fits exactly what LaaLM needs.

We have been using an L40S, but we got fed up with how slow and ineffective it is without FP8.

So we will use another accelerator. But there isn't really an accelerator whose price-to-performance fits us.

For the data generator, we have a weird problem.

Technically, v2's data generator is actually better than exp-v1's.

But for some reason, there's a leak somewhere we can't find that causes the model to just cheat.

Maybe it's our indicator tokens, or something subtler, but LaaLM-v2 has definitely got us fed up with the whole LaaLM franchise.

We also have sad news for LaaLM today.

LaaLM-v2 will be the last model of the LaaLM series.

Our reason is that we have better projects to spend our compute on than a bash predictor that any other model can beat.

LaaLM has definitely been a fun experience for us, but we can't justify spending precious compute on something this experimental and not very useful.

Maybe future models will still come, but we recommend you stop expecting new models after LaaLM-v2.
---

## How to use it