Week 6 Study Summary
Lecture Review
NLP (Posts 1-9)
https://velog.io/@naem1023/series/NLP
Assignment Process / Results
bucketing
A technique that reorganizes batches by data length, so that sequences of similar length end up in the same batch. It came up in Assignment 4. The concept itself was easy to understand, but the code was very hard to follow.
It finally made sense during the peer session. Here is what the code was trying to do:
- Group data lengths into units of max_pad_len by taking the quotient of length divided by max_pad_len.
- For example, with max_pad_len=5, lengths 5-9 all give a quotient of 1, so data with those lengths should be reorganized into a single batch.
- Instead of moving the data itself around, store the data indices separately and reorganize the batches by index.
- Order the indices so that data with similar lengths sit next to each other in the list.
- e.g., the indices of data with lengths 5-9 are positioned adjacently in the list.
- Once that is done, simply reading the index list in order gives you the reorganized batches.
This is what the bucketing code in Assignment 4 contained.
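To make the steps above concrete, here is a minimal sketch of index-based bucketing. It is not the Assignment 4 code. The function name bucketed_batch_indices, the batch_size parameter, and the sample lengths are all made up for illustration; only max_pad_len comes from the notes above.

```python
from collections import defaultdict


def bucketed_batch_indices(lengths, batch_size, max_pad_len=5):
    """Return batches of dataset indices, grouped by length bucket.

    Hypothetical helper for illustration only; the data itself is never moved.
    """
    # Bucket key = quotient of length by max_pad_len,
    # e.g. lengths 5-9 all map to key 1 when max_pad_len=5.
    buckets = defaultdict(list)
    for idx, length in enumerate(lengths):
        buckets[length // max_pad_len].append(idx)

    # Walk the buckets in key order so that indices of similar-length
    # data end up adjacent, then cut each bucket into batches.
    batches = []
    for key in sorted(buckets):
        indices = buckets[key]
        for start in range(0, len(indices), batch_size):
            batches.append(indices[start:start + batch_size])
    return batches


# Assumed sample data: the length of each item in a toy dataset.
lengths = [7, 3, 12, 5, 9, 1, 14, 6]
batches = bucketed_batch_indices(lengths, batch_size=2)
print(batches)  # [[1, 5], [0, 3], [4, 7], [2, 6]]
```

In this toy run, indices 0, 3, 4, and 7 (lengths 7, 5, 9, 6) share bucket 1, so they land in neighboring batches, and reading the result in order gives the reorganized batches. A real training loop would probably also shuffle within and across buckets, which this sketch leaves out.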
Peer Session Summary
We had extensive discussions about the bucketing problem mentioned above.
We also shared experiences about team formation. Unlike CV, everyone in NLP was very enthusiastic about team formation, so the recruiting process was very fast.
Study Retrospective
- 21/09/06: Studied Word2Vec, solved Assignment 1
- 21/09/07: Studied RNN, LSTM, solved Assignments 2, 3
- 21/09/08: Studied attention, solved Assignment 4
- 21/09/09: Spent a lot of energy on team formation…
- 21/09/10: Reviewed Assignment 4