Through competitive programming, let's make Japan a powerhouse of highly skilled IT professionals! [1/9]

Published Date: 2019/10/03

What is truly needed to understand the real IT talent shortage

Naohiro Takahashi

AtCoder Inc.

#Marketing #Management/Business #Digital/Technology #Programming #Engineer #AI #IT #Recruitment #Human Resource Development

By 2030, Japan is projected to face a shortage of hundreds of thousands of highly skilled IT professionals. But did you know Japan actually has many highly skilled IT professionals with world-class programming abilities?

In this series, Naohiro Takahashi, CEO of AtCoder, which runs competitive programming contests, explains the recruitment and development of highly skilled IT professionals, a factor now critical enough to sway corporate performance. What is the state of Japan's highly skilled IT workforce as seen by Mr. Takahashi, himself a world-class programmer?

Will "Traditional IT Talent" Become Surplus? What IT Talent is Truly Needed?

Hello, I'm Takahashi from AtCoder. As you all know, the AI industry is attracting significant attention, and companies are scrambling to recruit top engineers. With all kinds of work being handed over to AI, AI specialists are now in demand not just at IT companies but at businesses of every kind. In Japan, it's said that there will be a shortage of hundreds of thousands of AI specialists and other highly skilled IT professionals in the coming decades, making talent development an urgent priority.

On the other hand, the government has also announced that traditional IT talent may become surplus in the future, creating confusion about what kind of talent is actually needed. What is needed now is for executives, HR personnel, and those around them to possess the foundational knowledge to properly consider "What is AI talent?" and "What is advanced IT talent?"

This series will explore such topics. As a preliminary step, however, I want to go beyond a superficial understanding and give you a deeper grasp of the fundamental question: "What is programming?"

The truth is, most people don't really understand it. What is programming?

Every computer in the world operates based on programs written in programming languages. Search engines for web pages, smartphone interfaces, and all software run because of programming.

Yet, the overwhelming majority of people don't know what programming actually entails. In that state, simply throwing around terms like "AI," "algorithm," or "deep learning" makes it difficult to arrive at a correct understanding.

First, let's try to get a simple idea of what programming actually is.

[Image: Programming 1]

This is a snippet of code written in the C# programming language, meaning "Display 'Hello World' on the screen!" This string of text is converted into a program through a process called compilation. When executed, it displays a screen like the one below.
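For readers who want something to run, here is a minimal sketch of the same idea. It is a Python stand-in, not the C# listing from the original article, and the function name `greeting` is made up for illustration. Python also runs the text directly rather than through a separate compilation step.

```python
# A stand-in sketch: the original article showed C#, this is Python.
# `greeting` is a hypothetical helper name chosen for this example.

def greeting() -> str:
    """Return the message that the program displays."""
    return "Hello World"


# Display "Hello World" on the screen.
print(greeting())
```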

Programming is essentially this: you describe what you want the computer to do, and it executes it exactly as instructed. This example was simple, but the more complex the program, the more code you need to write.

The source code has become a bit more complex. Source code refers to text written in a programming language. Since this article isn't a programming tutorial, you don't need to learn the syntax or writing style of C#. Just read the comments (written in Japanese in the original) and understand that they correspond one-to-one with the program's instructions.

Here is the execution screen. It first displays "Start," then enters a loop that repeats five times. On each pass, the program checks whether the current iteration number is even or odd, displaying "Even" if it is even and "Odd" otherwise. Finally, it displays "End" and finishes. The goal is for you to see that the comments in the source code match the text displayed when the program runs.
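The behavior just described can be sketched as follows. This is a Python approximation rather than the article's C# code, and it assumes the iterations are numbered 1 through 5 (the text does not say whether counting starts at 0 or 1). The function name `run` is hypothetical.

```python
# Sketch of the "Start / loop five times / End" program described above.
# Assumption: iterations are numbered 1..5; the original C# may differ.

def run() -> list:
    """Build the lines the program displays, in order."""
    lines = ["Start"]                 # display "Start"
    for i in range(1, 6):             # repeat five times (i = 1, 2, 3, 4, 5)
        if i % 2 == 0:                # is the iteration number even?
            lines.append("Even")      # yes: display "Even"
        else:
            lines.append("Odd")       # no: display "Odd"
    lines.append("End")               # display "End"
    return lines


for line in run():
    print(line)
```

Each comment lines up with exactly one instruction, which is the one-to-one correspondence the text asks you to notice.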

As you can see, programming involves combining "loops" (in this case, repeating a process five times) and "conditional branching" (in this case, checking whether the number is even) to describe, step by step, what you want the computer to do.

The program shown here only displays text on the screen. However, in actual programming, there are "functions" available for various tasks: displaying images, showing clickable buttons, specifying their positions, connecting to networks, and more.

By combining these functions and writing thousands or even tens of thousands of lines of source code, the software you use is completed.

What exactly is an "algorithm," the core of a program?

Now that we understand what programming is, you might be wondering, "Sure, the source code might get long, but isn't it just writing down what you want to do step by step?"

However, one of the challenging aspects of programming is that even for programs with the same goal, the "algorithm" used can significantly impact the program's speed and accuracy.

An algorithm, rendered in Japanese as "算法," refers to the specific procedure, and the reasoning behind it, by which a program achieves its goal. That may still be hard to visualize, so let's use a number-guessing game as an example.

The number-guessing game works like this:

① Person A imagines any integer between 1 and 1000.
② Person B guesses the number Person A is thinking of.
③ Person A tells Person B whether their guess is correct, too high, or too low.
④ Repeat steps ② and ③ until the correct number is guessed.

After imagining the number, what Person A does next is mechanically determined. Written as a program, Person A's actions might look something like this. The programs get more complex from here, so for now it is enough to follow the comments (written in Japanese in the original).

[Image: code 2]
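As a rough illustration of Person A's role in step ③, here is a minimal Python sketch. It is not the article's C# listing. The function name `person_a_reply` and the three reply strings are assumptions chosen for this example.

```python
# Sketch of Person A: given the secret number and a guess, answer mechanically.
# The reply strings are illustrative, not taken from the original code.

def person_a_reply(secret: int, guess: int) -> str:
    """Tell Person B whether the guess is correct, too high, or too low."""
    if guess == secret:
        return "correct"
    if guess > secret:
        return "too high"
    return "too low"


# Example: Person A is thinking of 437.
print(person_a_reply(437, 500))
print(person_a_reply(437, 437))
```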

Now, with this in mind, let's consider "how Person B makes their guesses."

The simplest method is to start low and work up: 1, 2, 3, and so on. By increasing the guess one step at a time, Person B is certain to reach the correct answer eventually. Written as a program, it might look like this.

[Image: code 3]
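A minimal Python sketch of this "count up from 1" strategy is shown below. It is not the original C# code. The names `reply` and `guess_in_order` are hypothetical, and the function returns the number of guesses used so the cost is easy to see.

```python
# Sketch of Person B guessing 1, 2, 3, ... in order.
# `reply` plays Person A; both function names are made up for this example.

def reply(secret: int, guess: int) -> str:
    """Person A's answer to a single guess."""
    if guess == secret:
        return "correct"
    return "too high" if guess > secret else "too low"


def guess_in_order(secret: int, low: int = 1, high: int = 1000) -> int:
    """Guess low, low+1, ... until correct; return how many guesses it took."""
    count = 0
    for guess in range(low, high + 1):
        count += 1
        if reply(secret, guess) == "correct":
            return count
    raise ValueError("the secret number is outside the agreed range")


print(guess_in_order(437))   # number of guesses needed when the secret is 437
```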

The resulting source code is very simple and easy to understand. But is this program really the best approach?

Of course, this is a very bad approach. The reason is simple: guessing "500" from the start reduces the number of guesses far more than starting with "1".

If you guess 1 and miss, you know the value is "between 2 and 1000." You've only narrowed the range by one number.

In contrast, if you guess 500 and miss, you learn whether the value is "between 1 and 499" or "between 501 and 1000." Either way, at least 500 numbers have been ruled out. With just one question, you've halved the possible range of values Person A might be thinking of.

By repeating this process of narrowing the range by half with each guess, you can arrive at the correct answer very quickly.

Previous method (guessing in order from 1):
[Image 1]
This method (guessing the middle of the remaining range):
[Image 2]

Programmed, it would look something like this, though it's become quite complex and probably difficult for anyone other than a programmer to understand. Don't worry if you don't understand it.
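For reference, the halving strategy can be sketched in Python as follows. This is an illustrative version, not the article's C# listing. The names `reply` and `guess_by_halving` are hypothetical, and the function returns the number of guesses used.

```python
# Sketch of Person B always guessing the middle of the remaining range.
# `reply` plays Person A; both function names are made up for this example.

def reply(secret: int, guess: int) -> str:
    """Person A's answer to a single guess."""
    if guess == secret:
        return "correct"
    return "too high" if guess > secret else "too low"


def guess_by_halving(secret: int, low: int = 1, high: int = 1000) -> int:
    """Halve the candidate range each turn; return how many guesses it took."""
    count = 0
    while low <= high:
        guess = (low + high) // 2        # middle of the remaining range
        count += 1
        answer = reply(secret, guess)
        if answer == "correct":
            return count
        if answer == "too high":
            high = guess - 1             # the secret is below the guess
        else:
            low = guess + 1              # the secret is above the guess
    raise ValueError("the secret number is outside the agreed range")


# Worst case over every possible secret from 1 to 1000.
print(max(guess_by_halving(s) for s in range(1, 1001)))
```

With the range 1 to 1000 the first guess is 500, and each answer discards roughly half of what remains.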

Now, how much difference do these two programs make? The first method requires up to 1000 guesses in the worst case, while this method guarantees a hit with only 10 guesses at worst.

This time, since the range of possible values is 1 to 1000, the difference is only 100-fold (1000 guesses versus 10). However, if the range were 1 to 1 million, halving would need at most 20 guesses, a 50,000-fold difference, and if it were 1 to 1 trillion, at most 40 guesses, a 25 billion-fold difference.
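These figures can be checked with a few lines of arithmetic. The Python sketch below assumes the in-order method needs n guesses at worst for a range of 1 to n, and that the halving method needs floor(log2(n)) + 1, which for a positive integer equals its bit length. The function names are hypothetical.

```python
# Worst-case guess counts for a range of 1..n under each strategy.
# Function names are illustrative only.

def worst_case_in_order(n: int) -> int:
    """Guessing 1, 2, 3, ...: the worst case is when the secret is n."""
    return n


def worst_case_halving(n: int) -> int:
    """Halving the range: floor(log2(n)) + 1, computed exactly with integers."""
    return n.bit_length()


for n in (1_000, 1_000_000, 1_000_000_000_000):
    slow = worst_case_in_order(n)
    fast = worst_case_halving(n)
    print(f"range 1..{n:,}: {slow:,} vs {fast} guesses ({slow // fast:,}-fold)")
```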

This lengthy explanation illustrates the "difference born from differing algorithms."

Both programs shared the same goal: predicting the value imagined by Person A. However, their approaches to achieving this goal differed, leading to significant variations in program efficiency and source code content.

To summarize, the part that defines "what approach to take to achieve the goal" is the algorithm.

Is there a connection between algorithms and real-world problems?

Honestly, many people who see the number-guessing game probably think, "What does this have to do with AI?" Even when I talk about this in lectures, many people perceive it as just a game.

Even if you know an algorithm itself, mastering how to apply it effectively is extremely difficult.

The algorithm mentioned earlier, "guessing from the middle," is actually the well-known "binary search" algorithm. However, few people can explain how to apply it in the real world.

Yet, the reality is that it's precisely because we benefit from these algorithms that we can comfortably use computers and the internet.

For example, consider the following English sentence:

"The word algorithm has its roots in Latinizing the name of Muhammad ibn Musa al-Khwarizmi in a first step to algorismus."

If you wanted to find the part containing the string "name" from this, how would you go about it?

The natural approach is to start from the beginning and check each word in turn: "The" isn't "name," "word" isn't "name," and so on, until you finally reach "name" and think, "There it is!"

String searches are generally built on this kind of "start-from-the-beginning" algorithm. But what if the text were very long? It's not hard to imagine that the longer it is, the longer the search takes.
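A minimal Python sketch of this start-from-the-beginning search, working word by word as in the description above, might look like this. The function name `find_word_from_start` is hypothetical, and stripping the final period is a simplification for this one sentence.

```python
# Sketch of a start-from-the-beginning word search.
# The function name and the punctuation handling are illustrative assumptions.

SENTENCE = (
    "The word algorithm has its roots in Latinizing the name of "
    "Muhammad ibn Musa al-Khwarizmi in a first step to algorismus."
)


def find_word_from_start(text: str, target: str):
    """Check each word in order; return its 1-based position, or None."""
    words = text.rstrip(".").split()
    for position, word in enumerate(words, start=1):
        if word == target:          # "The" isn't "name", "word" isn't "name", ...
            return position
    return None


print(find_word_from_start(SENTENCE, "name"))
```

The number of comparisons grows in step with the length of the text, which is why this approach struggles on very long inputs.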

Now, what if the English sentence from earlier had been given in this structure?

【A】
a[17], al-Khwarizmi[15], algorismus[21], algorithm[3], first[18], has[4], ibn[13], in[7], in[16], its[5], Latinizing[8], Muhammad[12], Musa[14], name[10], of[11], roots[6], step[19], The[1], the[9], to[20], word[2]

Just looking at this probably doesn't make sense, so I'll explain a bit more. First, let's label each word in the earlier English sentence with its position.

【B】
The[1] word[2] algorithm[3] has[4] its[5] roots[6] in[7] Latinizing[8] the[9] name[10] of[11] Muhammad[12] ibn[13] Musa[14] al-Khwarizmi[15] in[16] a[17] first[18] step[19] to[20] algorismus[21]

After this, if we rearrange the words in dictionary order (the order in which they would appear in a dictionary, ignoring capitalization), we get [A] above.
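The two steps, numbering the words and then sorting them, can be sketched in Python as below. The function name `build_index` is hypothetical. Sorting case-insensitively and breaking ties by position are assumptions, and Python's default character ordering places the hyphen in "al-Khwarizmi" before letters.

```python
# Sketch: label each word with its position, then sort into dictionary order.
# Case-insensitive comparison and tie-breaking by position are assumptions.

SENTENCE = (
    "The word algorithm has its roots in Latinizing the name of "
    "Muhammad ibn Musa al-Khwarizmi in a first step to algorismus."
)


def build_index(text: str) -> list:
    """Return (word, position) pairs sorted alphabetically by word."""
    words = text.rstrip(".").split()
    numbered = [(word, position) for position, word in enumerate(words, start=1)]
    return sorted(numbered, key=lambda pair: (pair[0].lower(), pair[1]))


index = build_index(SENTENCE)
print(", ".join(f"{word}[{position}]" for word, position in index))
```

The sorting is done once, in advance, and every later search can reuse the result.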

Now, what happens when we arrange the words in dictionary order like this? It turns out to have the exact same structure as the number guessing game we did earlier!

Since this sentence has 21 words, for binary search we focus on the middle word, the 11th one. In [A], the 11th word is "Latinizing," which comes "before" the target word "name" in dictionary order (ABC order). This means that if "name" is in this sentence, it must be "after" the middle, meaning it can only be in words 12 through 21.
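Continuing that reasoning to the end, the following Python sketch runs the full halving search over the alphabetized word list. The names `build_index` and `find_position` are hypothetical, comparisons ignore capitalization (an assumption), and for a word that appears more than once the function returns whichever occurrence it lands on first.

```python
# Sketch: binary search for a word in an alphabetized (word, position) list.
# Function names and the case-insensitive comparison are assumptions.

SENTENCE = (
    "The word algorithm has its roots in Latinizing the name of "
    "Muhammad ibn Musa al-Khwarizmi in a first step to algorismus."
)


def build_index(text: str) -> list:
    """Return (word, position) pairs sorted alphabetically by word."""
    words = text.rstrip(".").split()
    numbered = [(word, position) for position, word in enumerate(words, start=1)]
    return sorted(numbered, key=lambda pair: (pair[0].lower(), pair[1]))


def find_position(index: list, target: str):
    """Return (position in the sentence or None, list of words examined)."""
    key = target.lower()
    low, high = 0, len(index) - 1
    examined = []
    while low <= high:
        mid = (low + high) // 2          # middle of the remaining words
        word, position = index[mid]
        examined.append(word)
        if word.lower() == key:
            return position, examined
        if word.lower() < key:
            low = mid + 1                # target comes later in the dictionary
        else:
            high = mid - 1               # target comes earlier in the dictionary
    return None, examined


position, examined = find_position(build_index(SENTENCE), "name")
print(position, examined)
```

Only a handful of the 21 words are ever examined, starting with the 11th entry, "Latinizing".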

Systems that scale this algorithm to billions of words are precisely the search engines you use daily, like Google Search. By pre-sorting the vast, scattered texts of the world, they enable high-speed searches without examining every single document.

Of course, Google Search isn't built solely on this algorithm. The actual service running involves far more complex and numerous refinements, too extensive to cover here. But I hope you can now grasp how this "guessing game" approach is actually useful in real programming!

This algorithmic way of thinking is essential and not dependent on specific programming languages. While coding styles and development methodologies quickly become outdated with changing times, the task of "thinking through logic" remains absolutely necessary regardless of the era. Of course, whether mathematical ingenuity or algorithmic ingenuity is required depends on the situation. However, in today's world where various elements are increasingly becoming IT-driven, the need for this kind of thinking will only continue to grow.

Next time, we'll discuss the connection between AI talent and algorithms, as well as the skill level of Japanese programmers.