How to Check if a String Contains Chinese Characters in JavaScript

In this tutorial, you will learn how to check if a string contains Chinese characters in JavaScript. Since other languages such as Japanese and Korean include Chinese characters and their derivatives in their writing systems, we need to look for CJK characters in a string. CJK is a collective term for the Chinese, Japanese, and Korean languages.

The relevant Unicode character ranges are given below.

  • U+3040 – U+30FF: Katakana and Hiragana (Japanese only)
  • U+3400 – U+4DBF: CJK Unified Ideographs Extension A (Chinese, Japanese, and Korean)
  • U+4E00 – U+9FFF: CJK Unified Ideographs (Chinese, Japanese, and Korean)
  • U+F900 – U+FAFF: CJK Compatibility Ideographs (Chinese, Japanese, and Korean)
  • U+FF66 – U+FF9F: Half-width Katakana (Japanese only)
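
To make the list above concrete, here is a minimal sketch that stores the same ranges in a lookup table and reports which range a character falls into. The names CJK_RANGES and describeChar are hypothetical and used only for illustration; they are not part of the tutorial's script.

```javascript
// Hypothetical lookup table mirroring the ranges listed above.
const CJK_RANGES = [
  { start: 0x3040, end: 0x30ff, label: "Hiragana and Katakana" },
  { start: 0x3400, end: 0x4dbf, label: "CJK Unified Ideographs Extension A" },
  { start: 0x4e00, end: 0x9fff, label: "CJK Unified Ideographs" },
  { start: 0xf900, end: 0xfaff, label: "CJK Compatibility Ideographs" },
  { start: 0xff66, end: 0xff9f, label: "Half-width Katakana" },
];

// Returns the label of the range containing the first character of `char`,
// or null if that character is outside all of the listed ranges.
function describeChar(char) {
  const code = char.codePointAt(0);
  const range = CJK_RANGES.find((r) => code >= r.start && code <= r.end);
  return range ? range.label : null;
}

console.log(describeChar("你")); // "CJK Unified Ideographs" (U+4F60)
console.log(describeChar("カ")); // "Hiragana and Katakana" (U+30AB)
console.log(describeChar("A"));  // null
```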

There are numerous ways to check if a string contains Chinese characters. For the sake of simplicity, we will use a regular expression and the ternary operator (?) to accomplish our goal. The test() method of the RegExp object searches a string for a pattern and returns a Boolean value.
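
As a quick illustration of how test() and the ternary operator fit together, here is a minimal sketch. It assumes a shortened pattern that covers only the CJK Unified Ideographs range, and the variable names are chosen just for this example.

```javascript
// Shortened pattern for brevity: CJK Unified Ideographs only (U+4E00 to U+9FFF).
const pattern = /[\u4e00-\u9fff]/;

// test() returns true if the pattern matches anywhere in the string.
const hasCjk = pattern.test("Hello 世界");

// The ternary operator picks one of two values based on the Boolean.
const label = hasCjk ? "Yes" : "No";
console.log(label); // "Yes"

console.log(pattern.test("Hello world") ? "Yes" : "No"); // "No"
```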

In the following example, we have one global variable that holds a string. When the button is clicked, we will check if the string contains Chinese characters and display the result on the screen. Please have a look at the code example and the steps given below.

HTML & CSS

  • We have 3 elements in the HTML file (div, button, and h1). The div element is just a wrapper for the rest of the elements.
  • The innerText for the button element is “Check” and for the h1 element, it is “Result”.
  • We have done some basic styling using CSS and added the link to our style.css stylesheet inside the head element.
  • We have also included our JavaScript file script.js with a script tag at the bottom.
<!DOCTYPE html>
<html lang="en">

<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <meta http-equiv="X-UA-Compatible" content="ie=edge">
  <link rel="stylesheet" href="style.css">
  <title>Document</title>
</head>

<body>

  <div class="container">    
    <button>Check</button>
    <h1>Result</h1>
  </div>

  <script src="script.js"></script>
</body>

</html>
.container {        
    text-align: center;
}

button {
  margin-top: 10px;
  padding: 10px 20px;
}

JavaScript

  • We have selected the button element and h1 element using the document.querySelector() method and stored them in btnCheck and output variables respectively.
  • We have attached a click event listener to the button element.
  • We have a global variable myString which holds a string as its value.
  • We have the regExp variable which holds a regex pattern as its value to match CJK characters.
  • In the event handler function, we are calling the test() method of regExp and passing myString as a parameter. It will return a Boolean value which we are storing in the found variable.
  • We are using the ternary operator (?) and checking whether found is true or false. Depending upon the result of the check, we will assign “Yes” or “No” to the result variable.
  • We are displaying the result in the h1 element using the innerText property.
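// Select the button and the heading that displays the result.
let btnCheck = document.querySelector("button");
let output = document.querySelector("h1");

let myString = "你好世界";
// No "g" flag here: a global regex keeps its lastIndex between test() calls,
// so repeated clicks would eventually report "No" for the same string.
let regExp = /[\u3040-\u30ff\u3400-\u4dbf\u4e00-\u9fff\uf900-\ufaff\uff66-\uff9f]/;

btnCheck.addEventListener("click", () => {
  // true if myString contains at least one CJK character
  let found = regExp.test(myString);
  let result = found ? "Yes" : "No";
  output.innerText = result;
});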
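
One detail is worth knowing when a regular expression is reused across clicks: a pattern created with the g flag remembers where its last match ended (in its lastIndex property), so repeated test() calls on the same string can eventually return false. The sketch below, with variable names chosen only for illustration, compares a pattern with and without the flag. For a simple yes/no check, the flag is not needed.

```javascript
// Same character class, once with and once without the global flag.
const withGlobal = /[\u4e00-\u9fff]/g;
const withoutGlobal = /[\u4e00-\u9fff]/;
const sample = "你好"; // two matching characters

// With "g", each call resumes from lastIndex; the third call finds nothing
// left to match, returns false, and resets lastIndex to 0.
const globalResults = [1, 2, 3].map(() => withGlobal.test(sample));
console.log(globalResults); // [true, true, false]

// Without "g", every call starts from the beginning of the string.
const plainResults = [1, 2, 3].map(() => withoutGlobal.test(sample));
console.log(plainResults); // [true, true, true]
```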