YAML is one of several formats that can be used to write configuration files. Depending on who you ask, the name stands for "Yet Another Markup Language" or "YAML Ain't Markup Language"; the latter emphasizes that it is meant for data, not documents. This tutorial covers how to load, read, edit, dump, and write YAML in Python using the PyYAML library.
YAML is easy for humans to write and read, including non-programmers. It is also very easy to parse, especially with Python and the PyYAML library. One of its biggest advantages is readability. Common alternatives for the same job include JSON and XML.
9 Quick YAML Features
YAML is not just another markup language! It is actually useful, and here are 9 reasons why:
- It has no executable commands.
- It carries explicit data types through tags.
- Parsers can give precise feedback when a document is invalid.
- The syntax is clean.
- It provides support for complex structures.
- Very easy to parse for computers.
- Easily readable for humans.
- Multiple documents can be stored in one file, separated by a --- line; this feature is used in Kubernetes definitions.
- Comments can also be used in YAML files.
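A few of these features can be seen in a short sketch. The document below is a made-up sample, not taken from any real project; it uses comments and two explicit tags and is parsed with yaml.safe_load():

```python
import yaml

# Hypothetical sample document, invented for this illustration.
doc = """
# Comments start with a hash and are ignored by the parser.
name: demo
port: !!str 8080      # explicit tag: keep this value as a string
ratio: !!float 3      # explicit tag: read this value as a float
tags: [a, b]
"""

config = yaml.safe_load(doc)
print(config)
```

Without the tags, port would be loaded as an integer and ratio as the integer 3.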
What is Python PyYAML parser and why you should use it
You will see many examples in which load() is used instead of safe_load(). This is worth pointing out early, because many readers are busy and quickly copy and paste example code.
Here is the difference. Like pickle, load() is a powerful function, but both are insecure because they allow an attacker to execute arbitrary code. PyYAML's load() function can serialize and deserialize complete Python objects and can even execute code, including calls to os.system, which can run any command on the system.
Note that calling load() without an explicit Loader argument has been deprecated since PyYAML 5.1 and produces a warning; since PyYAML 6.0 the Loader argument is required.
If you are parsing regular files, as most users are, make sure to use safe_load(), which supports only a safe subset of what load() can do.
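As a minimal sketch of the difference, the snippet below feeds safe_load() a document that asks the loader to call os.system. The input string is a hypothetical example of untrusted data; safe_load() has no constructor for python/* tags and should refuse it, while plain data loads normally:

```python
import yaml

# Hypothetical untrusted input that asks the loader to call os.system.
untrusted = "!!python/object/apply:os.system ['echo unsafe']"

try:
    yaml.safe_load(untrusted)
    rejected = False
except yaml.YAMLError as exc:
    # safe_load() does not know how to construct python/* tags,
    # so it raises an error instead of running anything.
    rejected = True
    print(f"Rejected: {exc}")

# Plain data loads without any problem.
plain = yaml.safe_load("retries: 3")
print(plain)
```

Nothing is executed here; the command inside the document is never run.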
We think the points below show why YAML is used and often preferred over other formats.
- It is both expressive and extensible.
- It supports an extensible set of data types for scalar values.
- It supports one-pass (one-direction) processing.
- It has a consistent data model to support generic tools.
- Carries a strict syntax.
- Very easy to use and implement.
- Comes with powerful tools like PyYAML.
- Implementations are fast and secure.
- The syntax is clean and simple.
- Simple to learn and read.
- It can express a wide variety of native data structures and allows custom extensions.
- It has portability across most programming languages.
- It can be read easily by humans.
- It can represent complex data structures in a format that humans can still read easily.
- It is unambiguous.
- It represents sequences as lists and mappings as dictionaries in a language-independent way.
- It is a superset of JSON, which means that every valid JSON document is also a valid YAML document.
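The last two points can be illustrated with a small sketch. The data is invented for this example, and one caveat applies: PyYAML implements YAML 1.1, so the "superset of JSON" claim holds for typical JSON like this but is not guaranteed for every edge case:

```python
import yaml

# The same hypothetical data, once as JSON text and once as block-style YAML.
json_text = '{"name": "web", "ports": [80, 443], "tls": true, "owner": null}'
yaml_text = """
name: web
ports:
  - 80
  - 443
tls: true
owner: null
"""

from_json = yaml.safe_load(json_text)
from_yaml = yaml.safe_load(yaml_text)

# Mappings become dicts, sequences become lists, null becomes None.
print(from_json)
print(from_json == from_yaml)
```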
Where YAML doesn’t fit in with Python
For configuration files, YAML is a very good choice. This is how we and many other developers use it. Its syntax is rich compared to alternatives such as .ini files, yet it is easy on the eyes and quite simple to write and parse.
However, there are some downsides, which are as follows:
- For simple use cases, such as data exchange of simple objects, it can be more versatile than necessary.
- It depends on indentation, which can be frustrating.
- It is not a part of the standard Python library whereas XML and JSON are.
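The indentation point deserves a quick illustration. In this sketch (with made-up keys), the two documents differ only in the indentation of one line, yet they produce different structures:

```python
import yaml

# "debug" is indented, so it belongs to the "server" mapping.
nested = yaml.safe_load("server:\n  port: 80\n  debug: true\n")

# "debug" is not indented, so it becomes a top-level key instead.
flat = yaml.safe_load("server:\n  port: 80\ndebug: true\n")

print(nested)
print(flat)
```

Both documents are valid, so the parser raises no error; a misplaced indent can silently change the meaning of a file.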
Install PyYAML Framework in Python
YAML data can be parsed by several Python packages; the most prevalent is PyYAML.
PyYAML is not part of the standard Python library, which means it needs to be installed with pip. Use the following command:
pip install pyyaml
To use it in scripts, import the module. Note that the module is named yaml, not pyyaml:
import yaml
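A quick way to confirm that the installation worked is to import the module and print its version. This is only a sanity check and assumes PyYAML is installed in the active environment:

```python
import yaml

# If this import fails with ModuleNotFoundError, PyYAML is not installed
# in the environment that is running the script.
print(yaml.__version__)
```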
Read and Parse YAML
Once the module has been imported, you can load a YAML file and parse it. YAML files usually carry the extension .yaml or .yml.
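Reading a file could look like the sketch below. To keep the example self-contained it first writes a small file to a temporary directory; the file name config.yaml and its contents are assumptions made for this illustration:

```python
import tempfile
from pathlib import Path

import yaml

with tempfile.TemporaryDirectory() as tmp:
    # Create a hypothetical config file so the example can run anywhere.
    path = Path(tmp) / "config.yaml"
    path.write_text("database:\n  host: localhost\n  port: 5432\n", encoding="utf-8")

    # Open the file and let PyYAML parse it into Python objects.
    with open(path, encoding="utf-8") as f:
        config = yaml.safe_load(f)

print(config["database"]["host"])
print(config["database"]["port"])
```

In a real script you would skip the temporary file and open your own configuration file directly.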
Parse YAML Strings
To parse any valid YAML string, simply use yaml.safe_load().
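For instance, a string can be parsed as in this minimal sketch (the keys and values are made up):

```python
import yaml

# Hypothetical YAML text held in a Python string.
text = """
title: Example
count: 3
enabled: true
items:
  - apple
  - banana
"""

data = yaml.safe_load(text)

# Scalars are converted to native Python types.
print(type(data["count"]).__name__)
print(data["items"])
```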
You can also define multiple documents in a single file, as long as they are separated by a triple dash (---). PyYAML parses such files with the yaml.safe_load_all() function, which returns a generator that yields the documents one by one.
Keep in mind that when you read documents from a file this way, the file must still be open while you iterate, so process the documents inside the with block.
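A sketch of reading a multi-document file follows. The file name manifests.yaml and the two documents are assumptions for illustration; note that the generator is consumed inside the with block, while the file is still open:

```python
import tempfile
from pathlib import Path

import yaml

# Two hypothetical documents separated by a triple dash.
multi = "kind: Service\nname: web\n---\nkind: Deployment\nname: web\n"

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "manifests.yaml"
    path.write_text(multi, encoding="utf-8")

    with open(path, encoding="utf-8") as f:
        # safe_load_all() is lazy, so iterate before the file is closed.
        kinds = [doc["kind"] for doc in yaml.safe_load_all(f)]

print(kinds)
```

If you need the documents after the file is closed, collect them into a list first, as done here.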
Dump and Write
Although many of us only read YAML as configuration files, it can be quite handy to write YAML too.
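You can use yaml.dump(), or its safer counterpart yaml.safe_dump(), to serialize Python objects to YAML. A separate package, ruamel.yaml, is another option for writing YAML.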
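Editing and writing with PyYAML might look like the following sketch. The configuration dictionary and the file name are hypothetical, and the sort_keys argument assumes PyYAML 5.1 or later:

```python
import tempfile
from pathlib import Path

import yaml

# Hypothetical configuration, edited in memory before being written out.
config = {"server": {"host": "localhost", "port": 8080}, "debug": False}
config["server"]["port"] = 9090

# Dump to a string; sort_keys=False keeps the insertion order of the keys.
text = yaml.safe_dump(config, sort_keys=False)
print(text)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "config.yaml"

    # Dump straight into an open file by passing it as the stream.
    with open(path, "w", encoding="utf-8") as f:
        yaml.safe_dump(config, f, sort_keys=False)

    # Read the file back to check that the round trip preserves the data.
    with open(path, encoding="utf-8") as f:
        reloaded = yaml.safe_load(f)

print(reloaded == config)
```

Keep in mind that PyYAML does not preserve comments when a file is loaded and dumped again.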
Change YAML to JSON
For this, you simply need to parse the YAML and then use the json module from the standard library to convert the resulting object to JSON.
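A minimal sketch of that conversion, using invented sample data:

```python
import json

import yaml

# Hypothetical YAML input.
yaml_text = "name: web\nports:\n  - 80\n  - 443\n"

# Step 1: parse the YAML into plain Python objects.
data = yaml.safe_load(yaml_text)

# Step 2: serialize those objects as JSON.
json_text = json.dumps(data, indent=2)
print(json_text)
```

This works as long as the YAML contains only types that JSON can represent; values such as dates would need extra handling.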
In this article, you learned how to read, write, load, edit, and dump YAML in Python with the help of PyYAML. YAML is extremely useful for custom configuration files, which you will come across a lot in automation and scripting. Do let us know how you have used YAML in your environment or project.