silicon valley stories: May 2020

Thursday, May 28, 2020

Attention Explained

Screenshot of Andrew Ng's explanation of attention model from the deeplearning.ai course

The problem with regular encoder-decoder architectures arises with long sentences, because RNNs struggle to carry information across many time steps. For example, when translating a long sentence, humans probably don't read the whole sentence and then translate it. The human mind more likely reads part of the sentence and translates that part before moving on. This leads us to attention models. While generating each output word, an attention model weighs the input words differently, focusing on the ones most relevant to that word.
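
The weighting idea can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the scores are plain dot products between a decoder state and each encoder state (attention models usually learn the scoring function), and the names `attend`, `encoder_states` and `decoder_state`, along with the vectors themselves, are made up for the example.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(decoder_state, encoder_states):
    """Return (weights, context) for one output word.

    Assumption of this sketch: each score is a dot product between the
    decoder state and one encoder state.
    """
    scores = [sum(d * h for d, h in zip(decoder_state, state))
              for state in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    # The context vector is the weighted average of the encoder states.
    context = [sum(w * state[i] for w, state in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context

# Three input words, each encoded as a 2-dimensional vector (made-up values).
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [2.0, 0.0]
weights, context = attend(decoder_state, encoder_states)
print(weights)  # the first and third input words get most of the weight
print(context)
```

Each output word gets its own set of weights, so the decoder can look at a different part of the input sentence at every step.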

Encoder Decoder Explained

Neural Machine Translation

x1, x2, ..., xTx - input sentence

y1, y2, ..., yTy - output sentence

Tx and Ty need not be equal, which means the length of the input sentence can differ from the length of the output sentence

The problem with regular encoder-decoder architectures arises with long sentences, because RNNs don't do well in these scenarios. For example, when translating a long sentence, we probably don't read the whole sentence and then translate it. The human mind more likely reads part of the sentence and then translates that part. This leads us to attention models. While generating each output word, an attention model weighs the input words differently. We will cover attention models in a separate post. Next we explore another encoder-decoder architecture where the input is an image, so the encoder produces an image encoding.

Image Caption Generation

Encoder : AlexNet or any other computer vision model can generate the image encoding

Decoder : an RNN-like architecture can decode the encoding to create the image caption

y1, y2, ..., yT - words of the generated caption, decoded from the image encoding

Transformers Explained

Wednesday, May 27, 2020

BERT Explained

Monday, May 25, 2020

NLP Primer

Attention

Visualizing a neural machine translation model

Transformers

[June 2017] Paper : Attention is all you need
[April 2018] Explainer blog post : The annotated transformer
[June 2018] Explainer blog post : The illustrated transformer

BERT

XLNet

XLM - Enhancing BERT for cross lingual language model

XLM explainer
How is XLM different from multilingual BERT
Paper : Cross lingual language model pretraining
Code : github link

XLM-R

XTREME

Blog : A massively multilingual multi-task benchmark for evaluating cross-lingual generalization

Friday, May 22, 2020

Market, economy and the investors

Market in the short term is the rate of change of news - like a popularity contest
Market takes the elevators on the way down and takes the stairs on the way up
The economy may be doing badly while the market is doing well.
Market is forward looking and the economy is current looking
Some job losses may be good for the economy and market, because that may clean up the old economy jobs and the antifragile
Economists may not be great investors
At the bottom of the crisis, any good economist should be able to say all the risks facing the markets and outline why the market may go down
Investing needs a healthy dose of optimism and faith during these tough times
Pessimists might sound smart, but optimists will make money
However, I don't want to imply that you should be a forever optimist and overlook your risks. Investing is personal, and risk is always measured from a frame of reference, which is your life situation. The answer should be different for everyone. Here is an essay to analyze your mortgage and tail risks

Bulls vs Bears

Psychology of a bear

There is a chance that prices come down, and the bet is that prices will stay down for a long time
Higher probability of buying low, because of the lower price and the longer time window
Fear of not being a price taker in a bear market

Psychology of hold

Even if the price goes down, it won't stay down for long enough, so it is not worth selling
There is a low probability of prices going up in the near term, but if I get out, it may be hard to get back in

Psychology of a bull

Prices will go up in the short to medium term
The current prices are a bargain factoring in future earnings and revenue growth
Even if there is some short-term variance, in the medium term the asset will be worth more, and there is no reason on the horizon to liquidate this asset

Investing essays during crisis

Musings of an investor during a crisis

Insurance

Insurance is cheap before the crisis (Gold prices)
Once the crisis is clear, insurance becomes expensive (gold prices shot up)
Insurance is not free. Insurance can be a drag on portfolio performance during good times. Hence asset allocation and rebalancing is important.

Range of outcomes

During this covid crisis, the absolute range of outcomes still stays vast

Negative outcome

The reopening of the economy could cause a huge second wave, forcing us to close down again. That could be devastating for the economy. The Fed may have prevented a short-term collapse, but the medium-term uncertainty remains, and a second wave could cause much longer-lasting, possibly permanent, damage to the economy. At this point, markets are probably pricing in one lost quarter. The crisis started in America in March, when cities began to shut down, so Q1 results were not really hampered. The lockdown effect will be visible in Q2. The market is hoping that the economy opens up, that results start trending back to normal by Q3, and that we have a great Q4 like last year. I think that is what is baked into prices, and we have already seen a swift recovery from the lows, when the market was down 35%.
However, if the reopening is hampered by a huge second wave of the virus, the market would start pricing in more than a quarter, and up to a year, of lost revenue. Things will get interesting when the market starts to choose between outcomes like

A quarter of lost revenue
Multiple quarters of lost revenue
Multiple years of continued revenue compression due to more permanent damage caused by a second wave

Positive outcome

There is a possibility of a vaccine by the end of the year, and several companies are already pre-scaling production in anticipation. This would definitely be the fastest vaccine ever developed
However, if it were to happen, it would be easy to say "long human ingenuity", or that this is what you expect to happen when the whole world gets behind one common goal
Don't fight the Fed. The Fed has done all that is in its power to remove tail risks
American capitalism may have changed forever. With the Fed buying high yield bonds, we may be entering an era of Government Sponsored Enterprises.

Both the positive and the negative outcomes look plausible. At this point, it remains hard to say which one is more likely, and on what timeline. The Fed continues to mitigate tail risks in the short term.

Sectors impacted

Travel
Hospitality
Airlines
Cruises
Retail stores
Mall real estate
Oil

Sectors at the risk of contagion

Commercial real estate
Hospital industry
Mortgage banks
Junk Bonds

Commentary on Big Tech

The economy continues to migrate from atoms to bits
Big Tech stock prices indicate that
Silicon Valley housing prices are still correlated with a big basket of tech stocks
User behaviors that would have played out over the next 5 years have been compressed into a span of 2 months
However, short-term revenue may take a hit if more companies are affected (ads could be more vulnerable than cloud revenue, followed by retail)

Thursday, May 21, 2020

Mortgage and tail risks

First, let me start with what this post is not about. I don't plan to cover whether to pay off a mortgage early or keep it; there are a lot of articles on the internet covering that. I plan to cover how to manage the tail risks of owning a home, given that real estate prices can be sky high in the coastal areas of the US, and how to manage the mortgage dance with a balanced portfolio that stays resilient when tail events do occur.

This post outlines some of the tail risks pertaining to home ownership

Losing your job with too small an emergency fund (~2 months) to cover expenses
A high debt-to-equity ratio on your assets (a huge mortgage combined with an equity price crash, leading to an inability to service mortgage payments). This is riskier with multiple mortgages and rental businesses
Losing your job and your immigration status - even a reasonable emergency fund (~6 months) may fall short
Mortgage may go underwater (2008 recession)
Natural calamity (earthquake, cyclone, infectious disease affecting cities - 2020)
Having a huge mortgage towards the end of a short term or long term debt cycle

How can you structure your portfolio to account for such risks

Have significant emergency funds in short and long term treasuries

Account for 1-2 years of mortgage payments on primary residence
Account for 6 months - 1 year of rental payments per unit in case of moratorium on rents
Account for personal and family expenses in case you are out of job and rental income for a while
Account for medical emergencies in case of health insurance loss
This is what a healthy balance sheet looks like. Google had $120 billion in cash, Microsoft had $130 billion, and Facebook had $60 billion going into the covid crisis. For each of these companies, that is ~10% of its market cap.
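
As a rough sketch, the buckets above can be added up into a single cash target. The function name, the default horizons (taken from the low end of the ranges above, plus an assumed 12 months for living expenses) and all dollar figures are hypothetical; plug in your own numbers.

```python
def emergency_fund_target(monthly_mortgage, monthly_rent_per_unit, rental_units,
                          monthly_expenses, medical_buffer,
                          mortgage_months=12, rent_months=6, expense_months=12):
    """Add up the emergency-fund buckets listed above.

    mortgage_months: 12-24 per the post (low end used by default).
    rent_months: 6-12 months of rent per unit that may stop coming in.
    expense_months and medical_buffer: assumptions; the post gives no numbers.
    """
    mortgage = monthly_mortgage * mortgage_months
    lost_rent = monthly_rent_per_unit * rental_units * rent_months
    living = monthly_expenses * expense_months
    return mortgage + lost_rent + living + medical_buffer

# Hypothetical household: $4,000 mortgage, two rental units at $2,500 each,
# $5,000 of monthly expenses and a $10,000 medical buffer.
target = emergency_fund_target(4_000, 2_500, 2, 5_000, 10_000)
print(target)  # 48,000 + 30,000 + 60,000 + 10,000 = 148000
```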

Diversify your equity holdings

Don't carry stock concentration risk in similar companies. For example, Google and Facebook both make money from ads
Don't carry concentration risk through overlapping index funds and individual stocks. For example, Google, Facebook, Amazon and Microsoft make up ~50% of QQQ and ~20% of the S&P 500
Don't hold index funds that are correlated with your home price. For example, QQQ and Bay Area home prices are correlated
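
The overlap between index funds and individual stocks is easy to check with a little arithmetic. The sketch below uses the approximate fund weights quoted above (~50% of QQQ, ~20% of the S&P 500); the function name and the portfolio amounts are hypothetical.

```python
# Approximate combined weight of Google, Facebook, Amazon and Microsoft in
# each fund, as quoted in the post. These are rough figures, not live data.
BIG_TECH_WEIGHT = {"QQQ": 0.50, "SP500": 0.20}

def big_tech_exposure(fund_holdings, direct_big_tech):
    """Return (dollars, fraction) of a portfolio that sits in the four stocks.

    fund_holdings: dollars held per index fund, keyed like BIG_TECH_WEIGHT.
    direct_big_tech: dollars held directly in the four stocks.
    """
    through_funds = sum(amount * BIG_TECH_WEIGHT[fund]
                        for fund, amount in fund_holdings.items())
    exposure = through_funds + direct_big_tech
    total = sum(fund_holdings.values()) + direct_big_tech
    return exposure, exposure / total

# Hypothetical portfolio: $100k in QQQ, $100k in an S&P 500 fund,
# and $50k held directly in the four stocks.
exposure, fraction = big_tech_exposure({"QQQ": 100_000, "SP500": 100_000}, 50_000)
print(exposure, fraction)
```

In this made-up portfolio, close to half of the money ends up in the same four companies even though most of it is nominally in "diversified" index funds.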

Buy insurance when it is cheap and VIX is low

Hold some alternate asset classes like Gold, bitcoin

All weather portfolios. If you had one at the beginning of the crisis, then your portfolio hopefully rose during the crisis. Also you are probably deriving healthy income from your portfolio.

40% Long term treasuries
15% Intermediate term treasuries
7.5% Gold
7.5% Commodities
30% Large Cap Equities
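
Keeping a portfolio at these weights means rebalancing after big moves. Here is a minimal sketch of that arithmetic, using the allocation above as the target; the function name and the current holdings are hypothetical.

```python
# Target weights from the all weather allocation listed above.
TARGET_WEIGHTS = {
    "long_term_treasuries": 0.40,
    "intermediate_term_treasuries": 0.15,
    "gold": 0.075,
    "commodities": 0.075,
    "large_cap_equities": 0.30,
}

def rebalancing_trades(current_values):
    """Return dollars to buy (+) or sell (-) per asset to restore the targets."""
    total = sum(current_values.values())
    return {asset: weight * total - current_values.get(asset, 0.0)
            for asset, weight in TARGET_WEIGHTS.items()}

# Hypothetical $100k portfolio after equities fell and treasuries rallied.
current = {
    "long_term_treasuries": 45_000,
    "intermediate_term_treasuries": 15_000,
    "gold": 9_000,
    "commodities": 6_000,
    "large_cap_equities": 25_000,
}
trades = rebalancing_trades(current)
for asset, amount in trades.items():
    print(f"{asset}: {amount:+,.0f}")
```

The trades sum to zero: the treasuries and gold that rose during the crisis get trimmed to buy the equities and commodities that fell.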

Diversify when it is cheap to build your portfolio for the future. Some areas to look into

Commercial real estate
Emerging markets
Oil

The end of a short-term or long-term debt cycle leads to deleveraging across the economy. If you have made money during the current credit cycle expansion, it may make sense to take some chips off the table

PS : the pessimist sounds smart, the optimist makes money

Sunday, May 10, 2020

The future of NLP

This is a summary of huggingface's talk on "The future of NLP"

The overall trend is that model sizes are growing too fast. Current state-of-the-art models have 1-10 billion parameters. NLP models cannot be run on custom hardware anymore, causing a huge gap between academia and industry
Three-pronged approach : distillation, pruning and quantization
Model distillation shrinks a teacher model by training a smaller student model to mimic it, without losing much of the teacher's predictive power. Knowledge distillation reduces inference cost and improves power efficiency.
Pruning works directly on the teacher model and removes weights from it to make it smaller

Head pruning
Weights pruning
Layer pruning - the layers are repetitions of the same module and they have shortcut connections, so removing a layer is less aggressive than in architectures without shortcut connections.
GPUs are optimized for dense matrix multiplications. If you run these sparse models on GPUs, they can be 3-4 times slower. Graphcore chips are specially designed for sparse models
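
Weight pruning can be illustrated with a small sketch. It assumes magnitude pruning (drop the weights closest to zero), which is one common criterion and not necessarily the one used in the talk; the function name and the weight matrix are made up.

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of the weights.

    weights is a list of rows (a small dense matrix). Ties at the threshold
    are all pruned, so slightly more than the requested fraction may go.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    magnitudes = sorted(abs(w) for row in weights for w in row)
    n_prune = int(len(magnitudes) * sparsity)
    if n_prune == 0:
        return [row[:] for row in weights]
    threshold = magnitudes[n_prune - 1]
    return [[0.0 if abs(w) <= threshold else w for w in row] for row in weights]

# A made-up 2x2 weight matrix, pruned to 50% sparsity.
layer = [[0.5, -0.1], [0.05, -0.9]]
pruned = prune_by_magnitude(layer, sparsity=0.5)
print(pruned)  # [[0.5, 0.0], [0.0, -0.9]]
```

The result has the same shape as the original matrix, just with zeros in it, which is why dense-matrix hardware gets no speedup from it without sparse kernels.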

Quantization - NNs work well on int8s and not just floats. Scale all the values from float to int using a scale factor and a zero point.
A comparison study of XLNet and BERT with large models
RoBERTa is BERT trained on more data, and it beat XLNet. BERT - 137 billion tokens (13 GB). RoBERTa - 2.2 trillion tokens (160 GB corpus).
Winograd Schema Challenge - you can apply heuristics to Wikipedia and create artificially augmented datasets that the model can learn from. This process is called fine-tuning.
Scaling laws for neural language models
Power laws fit well if you exclude the embedding parameters when counting the size of the model. If you double the parameters, dataset size or compute, the model loss improves by a roughly constant factor, which shows up as a straight line on a log-log plot
One of the goals of transfer learning is to make the model work on small datasets
Sample efficiency : how much better your model gets with one additional example
The metric to measure this is called "Online Code Length". It is a way to look at model compression
For SQuAD, in terms of sample efficiency : BERT trained on a QA dataset > BERT trained on Wikipedia > BERT randomly initialized
On what language model pretraining captures - oLMpics
What we would like is "out of domain generalization", what we have is "in-domain" generalization
Compositionality

Scan
PCFG dataset
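
Returning to the quantization point in the list above, here is a minimal sketch of float-to-int8 conversion with a zero point. It assumes a simple per-tensor affine scheme; the function names and sample values are made up, and real frameworks add calibration and per-channel options that are left out here.

```python
def quantize_int8(values):
    """Map floats to int8 using one scale and one zero point for all values."""
    lo, hi = min(values), max(values)
    lo, hi = min(lo, 0.0), max(hi, 0.0)   # keep 0.0 exactly representable
    scale = (hi - lo) / 255.0 or 1.0      # fall back to 1.0 if all values are 0
    zero_point = round(-128 - lo / scale) # the int8 code that represents 0.0
    quantized = [max(-128, min(127, round(v / scale) + zero_point))
                 for v in values]
    return quantized, scale, zero_point

def dequantize(quantized, scale, zero_point):
    """Recover approximate floats from the int8 codes."""
    return [(q - zero_point) * scale for q in quantized]

# Made-up weights spanning a small range.
weights = [-0.8, 0.0, 0.3, 1.2]
codes, scale, zero_point = quantize_int8(weights)
restored = dequantize(codes, scale, zero_point)
print(codes)
print(restored)
```

Each restored value lands within one quantization step (`scale`) of the original, which is the precision given up in exchange for storing 8 bits instead of 32.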

Sunday, May 3, 2020

Explainable AI - Interpretable Machine Learning

Reading list for Explainable AI

Friday, May 1, 2020

Lockdown review (Feb 2020 - April 2020)

Stayed home and acquired a lot of knowledge.

Courses taken

Deeplearning.ai - 5 courses by Andrew Ng
Deep Learning with PyTorch - Udacity
Privacy aware ML - Udacity
Reinforcement learning

Skills acquired

PyTorch
PySyft

Books read

Antifragile
The subtle art of not giving a fuck
The Intelligent Investor
The Manager's Path

Blogs read

Ray Dalio's notes on changing world order, debt cycles
Amazon's investor letters
Guggenheim's weekly articles

Blogging

Blog reached 21k views within 3 months
Recession call - March 8
Coronavirus lockdown prediction in bay area - March 1
Engineering Manager series published
Coronavirus series published
Financial Investing series published
Deep Learning Learner series published

I also enabled Amazon shopping ads on the blog because I started book reviews. Here is how that started

How would you rate my lockdown so far?
