After the meeting, Lexus followed Noelle out of the room and headed downstairs towards the common area. While they were walking, Noelle was telling Lexus about the members of the team and their projects, and about Professor Milton.
“What is Professor Milton usually like?” Lexus asked.
“The professor’s normally pretty casual as long as you do your work, so don’t worry too much!” she said as they made their way down the stairs.
Stopping dead in her tracks, she turned back to look at Lexus. “You do not want to make her angry though.”
Lexus nodded. “I’ll keep that in mind.”
As they arrived at the common area, a much more familiar space, Noelle made her way straight towards the sofas at the far corner of the room and sat down with a plop. “I’m so glad the sofas aren’t taken! They were empty on my way up to the office, and I prayed so hard during the meeting for them to stay unoccupied,” she said.
The system said in his head, “Wait, she wasn’t even listening?”
Amused, Lexus let out a light laugh and sat down on the sofa opposite Noelle’s.
Lexus looked around the common area. He could see only three groups of people in the room, with one group sitting in the opposite corner while the other two groups were clustered more towards the middle. They all sat on wooden chairs, with multiple laptops resting on each table.
Lexus himself wasn’t one to choose the sofas: he often did his work with pen and paper, and it would be awfully inconvenient without a table to work on. On this occasion, though, he didn’t really mind.
It wouldn’t be so bad to occasionally chat with friends on the sofa. Or maybe even do a weekend programming project with the computer science guys, with snacks and all. I’ll have to talk to Arthur about that.
Noelle's voice interrupted his train of thought. “So, how did you find the meeting?”
Lexus replied, “It was interesting! There was a lot I didn’t understand though. What did you mean by transformer and LSTM hybrids?”
Noelle tilted her head in confusion. “Transformer LSTM hybrids? Oh- you mean my idea about using transformer encoders and LSTM decoders for our project?”
Lexus nodded. “Yeah, I think that was it. It feels like I understand what each word means individually, but I have no idea what you mean when you put them together.”
Noelle chuckled. “Hahaha, that’s a pretty funny way to put it! To be honest, I’m surprised that you’ve even heard of concepts like encoding and decoding. Have you come across them before?”
Lexus was about to answer her when Noelle stopped him. “Wait wait, I have a great idea. How about you try to explain to me what you understand about encoding, decoding and their roles in neural machine translation? Then I can build on that to explain the transformer LSTM hybrid you were asking about.”
Lexus had no idea where the conversation was headed, but he decided to go with the flow. “Hold on, let me take out my pen and paper…”
Noelle leaped off her own sofa and sat down beside Lexus, clearing some space between them for him to place his stack of paper.
Lexus laid the paper flat on the surface of the sofa in between the two of them. They both leaned over the stack of paper as Lexus started to write.
“Before you misunderstand, let me just point out that there’s a ring on her finger,” the voice in his head said.
Lexus immediately froze in place, his pen barely touching the surface of the paper.
“Hmmm? What’s wrong?” Noelle asked.
Lexus looked up at Noelle and quickly replied, “Huh? Oh, it’s nothing.”
Dang it system, I’ll get you back later for this.
Lexus focused his attention back on the topic at hand, which was to explain the concept of encoders and decoders in a way that didn't make him seem like a complete idiot in front of a PhD student.
As per usual, whenever he had doubts about what to do, Lexus started with an example.
“Let’s say we want to translate the sentence ‘Knowledge is Power’ from English to French. Then what we want to end up with is ‘la connaissance est le pouvoir’.”
knowledge is power → la connaissance est le pouvoir
“There’s a problem though: the English version of the sentence has three words, while the French version has five. This means that we can’t simply translate the sentence word for word by looking it up in a dictionary. It gets even worse for long or complex sentences where you might have to swap the order of the words in order for the sentence to be coherent in the target language,” Lexus continued.
“So what we can do is to encode the original sentence into some sort of representation that captures its meaning directly using numbers and vectors.” Lexus paused to look up at Noelle. “Is it okay if I explain the encoding process of a recurrent neural network? It’s the only one I intuitively know…”
Noelle replied, “Of course! Do it however you want.”
Lexus looked back down and drew out the way that a recurrent neural network would encode a piece of text: word by word. “In a recurrent neural network, the model takes the encoding of all the previous words together with the next word in the sentence, and outputs an encoding of all the words up to the current one. It repeats this process until it reaches the end of the sentence,” he explained.
[https://firebasestorage.googleapis.com/v0/b/firescript-577a2.appspot.com/o/imgs%2Fapp%2Felizabeth%2FNwejUh_sGI.drawio.png?alt=media&token=8e593ba4-6269-49d5-9e27-30850cdc5b29]
“An encoder helps us to abstract the problem away from word-for-word translation and focus instead on the meaning of the sentence itself, which makes for better translations,” he continued.
“Mmm-hmm, and then what does the decoder do?” Noelle prodded.
Lexus drew another arrow from the encoded text and wrote the translated version of the sentence next to it. “The decoder takes the encoded vector and uses it to predict what the words of the translated sentence would be, word by word.”
[https://firebasestorage.googleapis.com/v0/b/firescript-577a2.appspot.com/o/imgs%2Fapp%2Felizabeth%2Fxb_6BujvGa.drawio.png?alt=media&token=b584f114-569c-4c21-a17c-f84996a47dae]
“This completes the encoding-decoding process of machine translation,” Lexus concluded.
Phew, I’m glad that went smoothly.
“That’s really good!” Noelle smiled. “You explained it really well, probably even better than I could have!”
The system added, “Yeah, even I could understand it without any problems.”
System is not a machine learning expert, confirmed.
“Now, let me explain what a Transformer-LSTM hybrid is. As your senior, I have to at least match your quality of explanation!” She pulled out a fresh piece of paper from beneath the one Lexus was writing on and placed it on top. Lexus handed her the pen he was holding, and she started to write.
“A transformer encoder LSTM decoder hybrid is exactly what it sounds like: instead of using simple recurrent neural networks to encode and decode sequences of text, it uses a transformer to encode and an LSTM to decode,” she said.
The system was blunt in its reply, “You don’t say.”
“There’s a paper that shows that combining the two model architectures improves overall performance, which I’ll send to you later for you to read,” Noelle continued. “In theory though, it’s because the transformer is better at extracting context from a given text due to its self-attention mechanism. However, it’s not that helpful when decoding, because decoding requires sequential knowledge of the previously translated words in order for the translated sentence to make coherent sense.”
Lexus nodded. “Yeah, a transformer probably wouldn’t perform well at sequential tasks compared to LSTMs, which have sequential knowledge inherently built in.”
Noelle tapped the pen on the piece of paper in recognition. “Exactly! LSTMs, which are a specialised form of recurrent neural networks, would therefore perform better at decoding.”
“That makes sense,” Lexus said. “No wonder the hybrid architecture is popular. It combines the strengths of the transformer and the LSTM to create a better machine learning model than either architecture could be on its own.”
Ding!
---
Computer Science EXP +2
---
Wow, today’s a great day. I’ve been learning like crazy! He had never earned so much experience in a single morning before, not even during supervisions with Professor Emerson.
“As for your homework for today, how about you try to implement a transformer using Python?” Noelle asked.
“Huh?” Lexus was taken aback by Noelle’s suggestion. He had implemented a machine learning model before, but that was done with a very simple recurrent neural network model, not a transformer. Also, he’d never tried to implement a translation model before. Just the encoding and decoding alone would probably send his head spinning, not to mention the actual training of the model.
Was this even possible with his current skill and knowledge? Lexus doubted it, and he said as much. “It sounds pretty advanced… wouldn’t it be too hard for me?”
“Don’t worry, it’ll be fine! It won’t be as difficult as you expect. And there’s always Google to turn to if there’s something you don’t know.” Noelle gave an encouraging smile. “Try it out for a week and let me know how it goes. If you still think it’s too hard, then we can do something else. Who knows, maybe you’ll find it super fun!”
Noelle’s words dissipated much of the concern in Lexus’ mind. If it was just a week, he could give it a shot. He just didn’t want to waste Noelle’s time by failing to produce anything of value after two months of work. If he thought of the “homework” not as a part of work but as a personal project, then it sounded very fun indeed!
“Yeah, I’ll give it a try!” Lexus decided.
Noelle’s smile grew wider as she said, “Great, I’m glad you’re up for it! Let me know if you have any questions while implementing the transformer. There’s also a reading list for the project that I’ll send to you later today via email. What’s your email handle?”
Lexus wrote his email down and handed the piece of paper over to Noelle, who folded it and placed it in her pocket.
“The programming and paper reading will probably keep you busy for the next week, so we can discuss the rest of the details after the next team meeting. But just to give you the basics, the project we’re working on together will probably take about two to three months. The paper writing will probably stretch into your exam season, but you’ll be super busy then so you can count on me for that,” Noelle said.
Two to three months from now would be about May to June, which was exactly the period when everyone would be frantically studying for their coming exams.
Hmmm, I guess I’ll have to start revising early. Lexus didn’t want to miss out on the paper writing process because of his exams, so he was determined to find a way to balance both academics and research at the same time.
Hopefully, the time he spent reading all the books and raising his experience to Level 1 would come in handy. Didn’t the system say at some point that the best student in his year was barely at the standard of level 1? That was a few months ago though, so it might not be true now.
At the very least, Lexus could count on [Increased Processing Speed] in order to revise the first year content in less time.
Speaking of which, the amount of effort Lexus was putting into learning the first year computer science content was steadily decreasing as the months passed by. He had opted to spend more time reading more advanced textbooks and papers in his free time, and it had worked out great! As long as he focused during lectures, Lexus found that he no longer had any difficulty completing the problem sheets and programming exercises.
Lexus took a moment to reflect on the progress he had made in a few short months. The me from a few months ago would be ecstatic at the thought of breezing through first year. But now, my mind is so preoccupied with the more advanced computer science and maths that class no longer satisfies me. I'm far from content with staying at my current level when there is still so much to learn!