How to Convert Node.js Buffer to String

by Diogo Kollross


Buffer Objects

For a long time, JavaScript was lacking support for handling arrays of binary data. That changed with ECMAScript 2015, when typed arrays were introduced and allowed better handling of those cases.

All these types descend from an abstract TypedArray class, each one specializing in specific integer word sizes and signing, and some providing floating-point support. Float64Array, for example, represents an array of 64-bit floating-point numbers; Int16Array represents an array of 16-bit signed integers.

Of all typed arrays, the most useful probably is Uint8Array. It allows handling binary data such as image files, networking protocols, and video streaming in a way that is similar to how it's done in other programming languages.
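As a rough illustration of the typed arrays mentioned above, the sketch below creates three of them and prints their element sizes. The variable names are only illustrative.

```javascript
// Each typed array interprets raw memory with a fixed element type.
const floats = new Float64Array([1.5, 2.25]);
const shorts = new Int16Array([-1, 32767]);
// Values outside 0..255 wrap around modulo 256 in a Uint8Array.
const bytes = new Uint8Array([255, 256, -1]);

console.log(floats.BYTES_PER_ELEMENT); // 8
console.log(shorts.BYTES_PER_ELEMENT); // 2
console.log(Array.from(bytes)); // [ 255, 0, 255 ]
```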

In the Node.js world, there's another class, Buffer, that descends from Uint8Array. It adds some utility methods like concat(), allocUnsafe(), and compare(), methods to read and write various numeric types, similar to what the DataView class offers (e.g., readUint16BE and writeDoubleLE), and byte-order helpers such as swap32.
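A minimal sketch of a few of these Buffer methods follows. The variable names (first, second, joined) are made up for this example.

```javascript
const first = Buffer.from([0x12, 0x34]);
const second = Buffer.from([0x56, 0x78]);

// A Buffer is also a Uint8Array.
console.log(first instanceof Uint8Array); // true

// concat() joins several buffers into a new one.
const joined = Buffer.concat([first, second]);
console.log(joined); // <Buffer 12 34 56 78>

// The same two bytes read as big-endian and little-endian 16-bit integers.
console.log(joined.readUInt16BE(0).toString(16)); // 1234
console.log(joined.readUInt16LE(0).toString(16)); // 3412

// compare() returns -1, 0, or 1, like a sort comparator.
console.log(Buffer.compare(first, second)); // -1
```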

Text Encoding

The subject of encoding text characters using computers is vast and complex and I'm not going to delve deeply into it in this short article.

Conceptually, text is composed of characters: the smallest pieces of meaningful content - usually letters, but depending on the language there may be other, very different symbols. Then each character is assigned a code point: a number that identifies it in a one-to-one relationship.

Finally, each code point may be represented by a sequence of bits according to an encoding. An encoding is a table that maps each code point to its corresponding bit string.
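To make the character, code point, and encoding layers more concrete, this sketch lists the code point and the UTF-8 bytes for three sample characters. The sample string and the rows variable are just illustrative; escape sequences are used so the example doesn't depend on how this file itself is encoded.

```javascript
// 'A', 'é' (U+00E9), and '€' (U+20AC).
const rows = Array.from('A\u00e9\u20ac', (ch) => ({
  char: ch,
  codePoint: 'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0'),
  utf8Hex: Buffer.from(ch, 'utf8').toString('hex'),
}));

for (const row of rows) {
  console.log(row.char, row.codePoint, row.utf8Hex);
}
// A U+0041 41
// é U+00E9 c3a9
// € U+20AC e282ac
```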

It may look like it's a simple mapping, but several encodings were created in the last decades to accommodate many computer architectures and different character sets. 

Historically, the most widespread encoding was ASCII.

It includes the basic Latin characters (A-Z, upper and lower case), digits, and many punctuation symbols, besides some control characters. ASCII is a 7-bit encoding, but computers store data in 8-bit blocks called bytes, so many 8-bit encodings were created, extending the first 128 ASCII characters with another 128 language-specific characters.

A very common encoding is ISO-8859-1, also known by several other names like "latin1", "IBM819", "CP819", and even "WE8ISO8859P1". ISO-8859-1 adds some extra symbols to ASCII and many accented letters like Á, Ú, È, and Õ.
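As a small sketch, here is how one accented letter looks in ISO-8859-1 (called 'latin1' in Node.js) compared with UTF-8. The variable names are illustrative.

```javascript
const accented = '\u00c1'; // Á

// In latin1, the letter fits in a single byte.
const latin1Bytes = Buffer.from(accented, 'latin1');
console.log(latin1Bytes); // <Buffer c1>
console.log(latin1Bytes.toString('latin1')); // Á

// In UTF-8, the same letter needs two bytes.
const utf8Form = Buffer.from(accented, 'utf8');
console.log(utf8Form); // <Buffer c3 81>
```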

Those language-specific encodings worked reasonably well for some situations but failed miserably in multi-language contexts.

Information loss and data corruption were common because programs tried to manipulate text using the wrong encoding for a file.

Even showing the file content was a mess, because text files don't include the name of the encoding that was used to encode their content.
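This kind of corruption is easy to reproduce. The sketch below encodes a word with one encoding and decodes it with another; the variable names are only for illustration.

```javascript
const original = 'gu\u00e9rir'; // guérir

// UTF-8 bytes read as latin1: each byte of the 2-byte sequence becomes its own character.
const misread = Buffer.from(original, 'utf8').toString('latin1');
console.log(misread); // guÃ©rir

// latin1 bytes read as UTF-8: the lone 0xe9 byte is invalid UTF-8,
// so the decoder substitutes the replacement character U+FFFD.
const lossy = Buffer.from(original, 'latin1').toString('utf8');
console.log(lossy.includes('\ufffd')); // true
```

In the second case the original byte cannot be recovered from the decoded string, which is how text gets permanently damaged.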

Trying to fix that, the Unicode standard was started in the 1980s. Its objective is to map every currently used character in every human language (and some historic scripts) to a single code point.

The Unicode standard also defines a number of generic encodings that are able to encode every Unicode code point.

The most common encodings for Unicode are UTF-16, UCS-2, UTF-32, and UTF-8. UCS-2 and UTF-32 use a fixed width for all code points: 2 bytes for UCS-2 (which prevents it from representing all Unicode characters) and 4 bytes for UTF-32. UTF-8 and UTF-16 use a variable number of bytes to encode each code point.
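The variable widths can be observed with Buffer.byteLength(). This is a sketch; widths() is a hypothetical helper written for this example, and Node.js has no built-in UTF-32 encoding, so only UTF-8 and UTF-16 are shown.

```javascript
// Hypothetical helper: number of bytes a string takes in two encodings.
function widths(ch) {
  return {
    utf8: Buffer.byteLength(ch, 'utf8'),
    utf16le: Buffer.byteLength(ch, 'utf16le'),
  };
}

console.log(widths('A')); // { utf8: 1, utf16le: 2 }
console.log(widths('\u00e9')); // é -> { utf8: 2, utf16le: 2 }
console.log(widths('\u20ac')); // € -> { utf8: 3, utf16le: 2 }
console.log(widths('\u{1F600}')); // emoji outside the BMP -> { utf8: 4, utf16le: 4 }
```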

On the web, the most widely used Unicode encoding is UTF-8, because it's reasonably efficient for many uses. It uses 1 byte for ASCII characters and 2 bytes for most European and Middle Eastern scripts, but it's less efficient for Asian scripts, which require 3 bytes or more per character. For example, the French saying "Il vaut mieux prévenir que guérir" is equivalent to the following sequence of bytes when it's encoded using UTF-8 (the bytes are shown in hexadecimal):

Il vaut mieux pré
49 6c 20 76 61 75 74 20 6d 69 65 75 78 20 70 72 c3 a9
venir que guérir
76 65 6e 69 72 20 71 75 65 20 67 75 c3 a9 72 69 72
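The byte sequence above can be reproduced in Node.js. In this sketch the variable names are illustrative, and the accented letters are written as escapes so the result doesn't depend on the source file's encoding.

```javascript
const saying = 'Il vaut mieux pr\u00e9venir que gu\u00e9rir';
const encoded = Buffer.from(saying, 'utf8');

console.log(encoded.toString('hex'));
// 496c2076617574206d69657578207072c3a976656e697220717565206775c3a9726972

// 33 characters become 35 bytes: each "é" takes two bytes (c3 a9).
console.log(saying.length, encoded.length); // 33 35
```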

Convert Node.js Buffer to String

Having said all that, the content of a Buffer may be quickly converted to a String using the toString() method:

const b = Buffer.from([101, 120, 97, 109, 112, 108, 101]);
console.log(b.toString()); // example

Remember that speech about text encodings? Well, the Buffer class uses UTF-8 by default when converting to/from strings, but you can also choose another one from a small set of supported encodings:

const b = Buffer.from([101, 120, 97, 109, 112, 108, 101]);
console.log(b.toString('latin1')); // example
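Besides text encodings, toString() also accepts binary-to-text encodings such as hex and base64, plus an optional start and end offset. A short sketch, using an illustrative variable name:

```javascript
const buf = Buffer.from([101, 120, 97, 109, 112, 108, 101]);

console.log(buf.toString('hex')); // 6578616d706c65
console.log(buf.toString('base64')); // ZXhhbXBsZQ==

// Decode only bytes 0 (inclusive) to 2 (exclusive).
console.log(buf.toString('utf8', 0, 2)); // ex
```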

Most of the time, UTF-8 is the best option both for reading and writing. But for completeness, here is the full list of supported encodings in Node.js (as of September/2021) - the names are not case sensitive:

Encoding    Accepted aliases
ascii       (none)
base64      (none)
base64url   (none)
hex         (none)
latin1      binary
ucs2        ucs-2
utf8        utf-8
utf16le     utf-16le
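If you're not sure whether a name is accepted, Buffer.isEncoding() can check it at runtime. The following is a small sketch; the sameBytes variable is illustrative.

```javascript
// Names and aliases are matched case-insensitively.
console.log(Buffer.isEncoding('UTF-8')); // true
console.log(Buffer.isEncoding('binary')); // true

// UTF-32 is not among the encodings Buffer supports.
console.log(Buffer.isEncoding('utf-32')); // false

// An alias produces exactly the same bytes as the canonical name.
const sameBytes = Buffer.from('example', 'UTF8').equals(Buffer.from('example', 'utf-8'));
console.log(sameBytes); // true
```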

Convert Node.js String to Buffer

It is also possible to convert data in the opposite direction. Starting from a string, you can create a new Buffer from it (if the encoding is not specified, Node.js assumes UTF-8):

const b = Buffer.from('example', 'utf8');
console.log(b); // <Buffer 65 78 61 6d 70 6c 65>
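Keep in mind that the number of bytes in the resulting Buffer may differ from the number of characters in the string. A minimal sketch, with illustrative variable names:

```javascript
const word = 'caf\u00e9'; // café
const wordBytes = Buffer.from(word); // UTF-8 by default

console.log(word.length); // 4 (characters)
console.log(wordBytes.length); // 5 (bytes: "é" takes two in UTF-8)

// byteLength() reports the size without creating a Buffer.
console.log(Buffer.byteLength(word, 'utf8')); // 5
console.log(Buffer.byteLength(word, 'latin1')); // 4
```

This matters when sizing a Buffer ahead of time, for example with Buffer.alloc().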

If you need to write text to an existing Buffer object, you can use its write() method:

const b = Buffer.alloc(10);
console.log(b);
// <Buffer 00 00 00 00 00 00 00 00 00 00>

b.write('example', 'utf8');
console.log(b);
// <Buffer 65 78 61 6d 70 6c 65 00 00 00>

You can even set the starting position (offset):

const b = Buffer.alloc(10);
b.write('test', 4, 'utf8');
console.log(b);
// <Buffer 00 00 00 00 74 65 73 74 00 00>
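The write() method returns the number of bytes written, which is useful because it silently stops when the Buffer runs out of space. The sketch below uses illustrative variable names.

```javascript
// Only the first 4 bytes of the string fit.
const small = Buffer.alloc(4);
const written = small.write('example', 'utf8');
console.log(written); // 4
console.log(small.toString()); // exam

// A multi-byte character that doesn't fit completely is not written at all.
const tiny = Buffer.alloc(3);
const partial = tiny.write('\u00e9\u00e9', 'utf8'); // "éé" needs 4 bytes
console.log(partial); // 2
console.log(tiny); // <Buffer c3 a9 00>
```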

Conclusion

In a nutshell, it's easy to convert a Buffer object to a string using the toString() method. You'll usually want the default UTF-8 encoding, but it's possible to indicate a different encoding if needed. To convert from a string to a Buffer object, use the static Buffer.from() method - again optionally passing the encoding.


FAQs

Q: Is a buffer a string?
A: A string is a sequence of characters, but a buffer is a sequence of bytes. Even though a buffer might contain the encoded content of a string value, it may also encode other kinds of values or any binary data.
Q: What does Buffer do in Node.js?
A: Buffers store and manipulate sequences of bytes. They are especially common when working with files: functions like fs.readFile() return Buffer objects when no encoding is specified. Buffers are also central to network communication and image processing.
Q: What is a buffer object in programming?
A: A buffer object is a way to abstract sequences or arrays of bytes. Besides common array operations like getting and changing one or more elements of the byte array, buffers usually have advanced methods to read and write more complex values such as integers, floating-point numbers, and strings.
Diogo Kollross
Senior Full-stack Developer

Diogo Kollross is a Full-stack Engineer with more than 14 years of professional experience using many different tech stacks. He has enjoyed programming since he was a child, and he also enjoys good food and traveling.
