Thursday, May 28, 2020

Attention Explained


Screenshot of Andrew Ng's explanation of attention model from the deeplearning.ai course




The problem with regular encoder-decoder architectures arises with long sentences, because RNNs don't do well in these scenarios. For example, when translating a long sentence, humans probably don't read the whole sentence and then translate it. The human mind probably reads a part of the sentence and then translates that part. This leads us to attention models: while generating each output word, the model weighs the input words differently.
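The weighting idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the model from the course: it scores each input position with a plain dot product (chosen here for brevity), and the vectors and the names `attention_weights` and `context_vector` are made up for the example.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(decoder_state, encoder_states):
    """Score every input position against the current decoder state.

    Dot-product scoring is an assumption made for this sketch.
    """
    scores = [sum(d * h for d, h in zip(decoder_state, enc))
              for enc in encoder_states]
    return softmax(scores)

def context_vector(weights, encoder_states):
    """Weighted average of the encoder states."""
    dim = len(encoder_states[0])
    return [sum(w * enc[i] for w, enc in zip(weights, encoder_states))
            for i in range(dim)]

# Toy example: 3 input words, each encoded as a 2-d vector (made-up numbers)
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [2.0, 0.0]  # hypothetical state for the word being translated

weights = attention_weights(decoder_state, encoder_states)
context = context_vector(weights, encoder_states)
print(weights)  # one weight per input word
print(context)  # what the decoder "looks at" for this output word
```

The first and third input words line up with the decoder state, so they receive most of the weight, while the second word contributes little to the context vector.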


References

Encoder Decoder Explained

Neural Machine Translation

x1, x2, ..., xTx - input sentence
y1, y2, ..., yTy - output sentence
Tx != Ty, which means the length of the input sentence can be different from the length of the output sentence


The problem with regular encoder-decoder architectures arises with long sentences, because RNNs don't do well in these scenarios. For example, when translating a long sentence, we probably don't read the whole sentence and then translate it. The human mind probably reads a part of the sentence and then translates that part. This leads us to attention models: while generating each output word, the model weighs the input words differently. We will cover attention models in a separate post. Next, we will explore another encoder-decoder architecture where the input is an image, so the encoder produces an image encoding.



Image Caption Generation

Encoder : AlexNet or any other computer vision model can generate the image encoding
Decoder : an RNN-like architecture can decode the encoding to create the image caption
y1, y2, ..., yT - output caption, decoded from the image encoding


Transformers Explained



References

Friday, May 22, 2020

Market, economy and the investors

  1. The market in the short term is the rate of change of news - like a popularity contest
  2. The market takes the elevator on the way down and the stairs on the way up
  3. The economy may be doing badly while the market is doing well
  4. The market is forward looking and the economy is current looking
  5. Some job losses may be good for the economy and the market, because they may clear out old-economy jobs and leave the antifragile standing
  6. Economists may not be great investors
  7. At the bottom of a crisis, any good economist should be able to list all the risks facing the markets and outline why the market may go down
  8. Investing needs a healthy dose of optimism and faith during these tough times
  9. Pessimists might sound smart, but optimists will make money
  10. However, I don't mean that you should be a forever optimist and overlook your risks. Investing is personal, and risk is always measured from a frame of reference, which is your life situation. The answer should be different for everyone. Here is an essay to analyze your mortgage and tail risks

Bulls vs Bears

Psychology of a bear

  1. There is a chance that prices come down, and the bet is that prices will stay down for a long time
  2. Higher probability of buying low, because of lower prices and a longer time window
  3. Fear of not being a price taker in a bear market

Psychology of hold

  1. Even if the price goes down, it won't stay down for long enough, so it is not worth selling
  2. There is a low probability of prices going up in the near term, but if I get out, it may be hard to get back in 

Psychology of a bull

  1. Prices will go up in the short to medium term
  2. The current prices are a bargain factoring in future earnings and revenue growth
  3. Even if there is some short term variance, in the medium term the asset will be worth more, and there is no reason on the horizon to liquidate this asset

Investing essays during crisis

Musings of an investor during a crisis

Insurance

  1. Insurance is cheap before the crisis (Gold prices)
  2. Once the crisis is clear, insurance becomes expensive (gold prices shot up)
  3. Insurance is not free. It can be a drag on portfolio performance during good times. Hence asset allocation and rebalancing are important.

Range of outcomes


During this COVID crisis, the absolute range of outcomes still stays vast

Negative outcome

  1. The reopening of the economy could cause a huge second wave, forcing us to close again. That could be devastating for the economy. The Fed may have prevented a short term collapse, but the medium term uncertainty remains. A second shutdown could also cause much more long lasting damage to the economy. At this point, markets are probably pricing in a lost quarter. The crisis started in America in March, when cities started to shut down, so Q1 results were not really hampered. The lockdown effect will be visible in Q2. The market is hoping that the economy opens up, that results start trending back to normal by Q3, and that we have a great Q4 like last year. I think that is what is baked into prices, and we have already seen a swift recovery from the 35% lows.
  2. However, if the reopening is hampered by a huge second wave of the virus, the market would start pricing in more than a quarter and up to a year of lost revenue. Things would be interesting when the market starts to price between outcomes like
    1. A quarter of lost revenue
    2. Multiple quarters of lost revenue
    3. Multiple years of continued revenue compression due to more permanent damage caused by a second wave 

Positive outcome

  1. There is a possibility of a vaccine by the end of the year, and several companies are already pre-scaling production in anticipation. This would definitely be the fastest vaccine ever
  2. However, if it were to happen, it would be easy to say "long human ingenuity" or this is what you expect to happen if the whole world gets behind one common goal
  3. Don't fight the Fed. The Fed has done all that is in its power to remove tail risks
  4. American capitalism may have changed forever. With the Fed buying high yield bonds, we may be entering an era of Government Sponsored Enterprises.

Both the positive and the negative outcomes remain plausible. At this point, it is hard to say which one is more likely and on what timeline. The Fed continues to mitigate tail risks in the short term.

Sectors impacted 

  1. Travel
  2. Hospitality
  3. Airlines
  4. Cruises
  5. Retail stores
  6. Mall real estate
  7. Oil

Sectors at the risk of contagion

  1. Commercial real estate
  2. Hospital industry
  3. Mortgage banks
  4. Junk Bonds

Commentary on Big Tech

  1. The economy continues to migrate from atoms to bits
  2. Big Tech stock prices indicate that
  3. Silicon Valley housing prices are still correlated with a big basket of tech stocks
  4. User behaviors that would have panned out over the next 5 years have been expedited within a span of 2 months
  5. However, short term revenue may be hit if more companies get hit (ads could be more vulnerable than cloud revenue, followed by retail)

Thursday, May 21, 2020

Mortgage and tail risks


First let me start with what this post is not about. I don't plan to cover whether to pay off a mortgage early or to keep it. There are a lot of articles on the internet covering that aspect. I plan to cover how to manage the tail risks of owning a home, given that real estate prices can be sky high in the coastal areas of the US, and how to manage the mortgage dance with a balanced portfolio that stays resilient when tail events do occur.

This post outlines some of the tail risks pertaining to home ownership
  1. Losing your job and having too small an emergency fund (~2 months) to cover cash needs
  2. High debt-to-equity ratio across your assets (a huge mortgage plus an equity price crash, leading to an inability to service mortgage payments). This is riskier with multiple mortgages and rental businesses
  3. Losing your job and losing immigration status - even a reasonable emergency fund (~6 months) may fall short
  4. Mortgage may go underwater (2008 recession)
  5. Natural calamity (earthquake, cyclone, infectious disease affecting cities - 2020)
  6. Having a huge mortgage towards the end of a short term or long term debt cycle

How can you structure your portfolio to account for such risks
  1. Have significant emergency funds in short and long term treasuries
    1. Account for 1-2 years of mortgage payments on primary residence
    2. Account for 6 months - 1 year of rental payments per unit in case of moratorium on rents
    3. Account for personal and family expenses in case you are without a job and rental income for a while
    4. Account for medical emergencies in case of health insurance loss
    5. This is what a healthy balance sheet looks like. Google had 120 billion in cash, Microsoft had 130 billion in cash and Facebook had 60 billion in cash going into the COVID crisis. For all these companies this is ~10% of their market cap.
  2. Diversify your equity holdings
    1. Don't have stock concentration risk in similar companies. For example, Google and Facebook both make money from ads
    2. Don't have concentration risk through overlapping index funds and individual stocks. For example, Google, Facebook, Amazon and Microsoft make up ~50% of QQQ and 20% of the S&P 500
    3. Don't have correlation between your index funds and your home price. For example, QQQ and Bay Area home prices are correlated
  3. Buy insurance when it is cheap and VIX is low
    1. Hold some alternate asset classes like Gold, bitcoin
  4. All weather portfolios. If you had one at the beginning of the crisis, then your portfolio hopefully rose during the crisis. Also you are probably deriving healthy income from your portfolio. 
    1. 40% Long term treasuries
    2. 15% Intermediate term treasuries
    3. 7.5% Gold
    4. 7.5% Commodities
    5. 30% Large Cap Equities
  5. Diversify when it is cheap to build your portfolio for the future. Some areas to look into
    1. Commercial real estate
    2. Emerging markets
    3. Oil
  6. The ends of short term and long term debt cycles lead to deleveraging across the economy. If you have made money during the current credit cycle expansion, it may make sense to take some chips off the table
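
The all weather split in point 4 can be turned into dollar amounts with simple arithmetic. The sketch below is illustrative only: the 100,000 portfolio, the drifted holdings and the function name `rebalance_trades` are assumptions made up for the example, and only the percentages come from the list above.

```python
# Target weights from the all weather allocation in point 4
TARGET_WEIGHTS = {
    "long_term_treasuries": 0.40,
    "intermediate_term_treasuries": 0.15,
    "gold": 0.075,
    "commodities": 0.075,
    "large_cap_equities": 0.30,
}

def rebalance_trades(holdings, target_weights):
    """Return the amount to buy (+) or sell (-) per asset to restore the targets."""
    total = sum(holdings.values())
    return {asset: round(total * weight - holdings.get(asset, 0.0), 2)
            for asset, weight in target_weights.items()}

# Hypothetical holdings after equities fell and treasuries rallied
holdings = {
    "long_term_treasuries": 46000.0,
    "intermediate_term_treasuries": 16000.0,
    "gold": 9000.0,
    "commodities": 6000.0,
    "large_cap_equities": 23000.0,
}

trades = rebalance_trades(holdings, TARGET_WEIGHTS)
for asset, amount in trades.items():
    print(f"{asset}: {amount:+.2f}")
```

Positive numbers are buys and negative numbers are sells. In this made-up case the rebalance sells some of what rose (treasuries, gold) and buys what fell (equities, commodities).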
PS : the pessimist sounds smart, the optimist makes money

Sunday, May 10, 2020

The future of NLP

This is a summary of Hugging Face's talk on "The future of NLP"

  1. The overall trend is that model sizes are growing too fast. Current state of the art models have 1-10 billion parameters. NLP models cannot be run on commodity hardware anymore, causing a huge gap between academia and industry
  2. Three pronged approach : distillation, pruning and quantization
  3. Model distillation helps reduce the size of a teacher model by training a student model, without losing much of the predictive power of the teacher model. Knowledge distillation saves on inference cost and improves power efficiency.
  4. Pruning works directly on your teacher model and removes weights from it to make it smaller
    1. Head pruning
    2. Weights pruning
    3. Layer pruning - the layers are repetitions of the same module and they have a shortcut connection. So removing a layer is less aggressive than in architectures without shortcut connections.
    4. GPUs are optimized for dense matrix multiplications. If you use these sparse models on GPUs, they can be 3-4 times slower. Graphcore chips are specially designed for sparse models
  5. Quantization - NNs work well on int8s and not just floats. Scale all the values from float to int with a zero point conversion.
  6. A comparison study of XLNet and BERT with large models
  7. RoBERTa is BERT trained on more data, and it beat XLNet. BERT - 137 billion tokens (13 GB). RoBERTa - 2.2 trillion tokens (160 GB corpus).
  8. Winograd Schema Challenge - you can apply heuristics on the wiki and create artificial augmented datasets that the model can learn on. This process is called finetuning.
  9. Scaling laws for neural language models 
  10. Power laws fit well if you leave the embedding parameters out when counting the size of the model. As you scale up the parameters, dataset size and compute, the model loss improves along a straight line on a log-log plot
  11. One of the goals of transfer learning is to make the model work on small dataset
  12. Sample efficiency : how much better your model gets with one additional example
  13. The metric to measure this is called "Online Code Length". It is a way to look at model compression
  14. For SQuAD, in terms of sample efficiency: BERT trained on a QA dataset > BERT trained on Wikipedia > BERT randomly initialized
  15. On what language model pretraining captures - oLMpics
  16. What we would like is "out of domain generalization", what we have is "in-domain" generalization
  17. Compositionality 
    1. Scan
    2. PCFG dataset
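
The quantization step in the list above (point 5) can be illustrated with a small sketch. It assumes the common affine scheme, q = round(x / scale) + zero_point, mapped onto the signed 8-bit range -128..127. The talk may have used a different variant, and the function names and sample weights here are made up for the example.

```python
def quantization_params(values, qmin=-128, qmax=127):
    """Compute the scale and zero point that map the float range onto [qmin, qmax]."""
    lo = min(min(values), 0.0)  # widen the range so 0.0 is exactly representable
    hi = max(max(values), 0.0)
    if hi == lo:
        return 1.0, 0  # all values are zero, any scale works
    scale = (hi - lo) / (qmax - qmin)
    zero_point = int(round(qmin - lo / scale))
    return scale, zero_point

def quantize(values, scale, zero_point, qmin=-128, qmax=127):
    """Float -> int8: divide by the scale, shift by the zero point, clamp."""
    return [max(qmin, min(qmax, int(round(v / scale)) + zero_point))
            for v in values]

def dequantize(q_values, scale, zero_point):
    """int8 -> approximate float."""
    return [(q - zero_point) * scale for q in q_values]

weights = [-1.0, -0.25, 0.0, 0.5, 1.5]  # made-up float weights
scale, zero_point = quantization_params(weights)
q = quantize(weights, scale, zero_point)
restored = dequantize(q, scale, zero_point)

print(scale, zero_point)
print(q)         # integers in [-128, 127]
print(restored)  # close to the original floats
```

Each restored value differs from the original by at most about one quantization step (`scale`), which is the price paid for storing one byte per weight instead of four.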
References

Friday, May 1, 2020

Lockdown review (Feb 2020 - April 2020)


Stayed home and acquired a lot of knowledge.
  • Courses taken
    • Deeplearning.ai - 5 courses by Andrew Ng
    • Deep Learning with PyTorch - Udacity
    • Privacy aware ML - Udacity
    • Reinforcement learning
  • Skills acquired
    • PyTorch
    • PySyft
  • Books read
    • Antifragility
    • The subtle art of not giving a fuck
    • The Intelligent Investor
    • The Manager's path
  • Blogs read
    • Ray Dalio's notes on changing world order, debt cycles
    • Amazon's investor letters
    • Guggenheim's weekly articles
  • Blogging
    • Blog reached 21k views within 3 months
    • Recession call - March 8
    • Coronavirus lockdown prediction in bay area - March 1
    • Engineering Manager series published
    • Coronavirus series published
    • Financial Investing series published
    • Deep Learning Learner series published

I also enabled Amazon shopping ads on the blog because I started book reviews. Here is how that started



How would you rate my lockdown so far?

Books I am reading