What are XML, JSON, and YAML Serialization Formats for AI Applications?



Original Source Here

What are XML, JSON, and YAML Serialization Formats for AI Applications?

Data formats types in network automation and machine learning

Different serialization formats. An image by the Author

Introduction

Serialization is a type of language format that is used to convert the object and transmit it to a file, database, or stream of bytes. When we want to send data from one machine to another machine, we need the data to be in binary representation. These language formats are easy to send information as they are in key-value pairs.

Types of Data structures

  1. Structured data: Sometimes it is also called relational data as the data share the same format and schema of rows and columns. Structured query language (SQL) is used to find the data with simple queries.
  2. Semi-structured data: In this type, the data is not relatable as they are not stored in rows and columns schema. The data uses tags, indentation, or colon for key-value pairs for data hierarchy in organizations. Here, data serialization plays an important role so that developers make the data in a format that can be transmitted to another developer/machine.
  3. Unstructured data: This type of data is found as photos, audio, log files, and videos that cannot be defined in a proper format. The metadata associated with them is considered semi-structured but as a whole data is unstructured.

Types of serialization Language formats

  1. XML
  2. JSON
  3. YAML

XML

The full form of XML is Extensible Markup Language Which is widely used in many applications around the world. The format of the XML file used tags in angle brackets. This language is originally used to fast update dynamic content for dynamic web pages. In the modern world, it uses also the serialization language method but due to large verbose content, its speed is slow as compared to others serialization formats. More verbose/words mean the need for more storage area Which is another disadvantage of this format.

Example of the XML format

# the structure of the language is a tree structure
<employee ID="12">
<FirstName>Amit</FirstName>
<LastName>Chauhan</LastName>
<Department>
<Deparment_name>Accounts</Deparment_name>
</Department>
</employee>

JSON

The full form of JSON is JavaScript Object Notation Which is most frequently used in web services and also as network automation. This serialization format is fast and human-readable. The format is less verbose i.e. take less memory to store data. The data expression is more likely as key-value pair with curly braces to identify its data structure.

Example of JSON format

# the structure of the language is a map structure
{
"employeeID": "12",
"FirstName": "Amit",
"LastName": "Chauhan",
"Department": [
{"Deparment_name": "Accounts"}
]
}

YAML

The full form of YAML is YAML Ain’t Markup Language that is recently developed as a data serialization language. This language is more readable and less verbose than JSON. It uses a colon, line, and indentation to define the data structure as compared to other languages that use commas, and parentheses. This file format is mostly used as a configuration file that the computer parsed to configure the machine or other updates.

Example of the YAML format

# the structure of the language is a map structure

employeeID: 12,
FirstName: Amit,
LastName: Chauhan,
Department:
Deparment_name: Accounts

We can see that the Department_name is indented to its parent Department.

Different programming languages that use serialization language

  1. Python: In python, the serialization is done by a pickle module and can be de-serialized with only python based language only.
  2. Java: It handles the serialization internally to those objects/classes that are marked as java.io.Serializable.
  3. Javascript: It uses the internal built-in method JSON.parse() and others.
  4. .Net: It uses three serialization methods i.e. JSON, Binary, and XML.
  5. PHP: This language uses built-in methods for serialization and de-serialization i.e. serialize() and deserialize().

Final thought

The file formats and data types are very important to know and use in different operations to reduce the latency in the application.

AI/ML

Trending AI/ML Article Identified & Digested via Granola by Ramsey Elbasheer; a Machine-Driven RSS Bot

%d bloggers like this: