Hello,
Do you have any guidance on how to read large files, around 150 GB in size?
I am trying to follow this; is there any other advice?
https://www.ovito.org/manual/python/introduction/advanced_topics.html#using-ovito-with-python-s-multiprocessing-module
How many cores should be used for reading the file?
Thank you
Reading/parsing the file is a single-core CPU operation. The question is what comes after that: what do you want to do with the dataset once it is loaded? You will certainly need a machine with a lot of RAM to process a dataset of that size, and not all functions of OVITO run multi-threaded or support billions of particles.
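For completeness, the manual page you linked parallelizes over trajectory frames, not within the reading of a single frame. A rough sketch of that pattern (not the exact example from the manual; the file name, worker count, and per-frame reduction below are placeholders) could look like this:

import multiprocessing
from ovito.io import import_file

def process_frame(frame):
    # Each call opens the file itself (kept simple for this sketch);
    # parsing an individual frame is still a single-core operation.
    pipeline = import_file('file.dump')  # placeholder file name
    data = pipeline.compute(frame)
    # Placeholder per-frame reduction: mean particle position.
    return data.particles.positions[...].mean(axis=0)

if __name__ == '__main__':
    num_frames = import_file('file.dump').source.num_frames
    # This only helps if the dump contains many frames; the worker count
    # is a tuning choice, limited mainly by available RAM.
    with multiprocessing.Pool(4) as pool:
        results = pool.map(process_frame, range(num_frames))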
I need OVITO to read the dump files and extract the IDs and positions to be processed later.
If reading in OVITO is a single-core operation, would it be faster to process the file via bash?
I’m not entirely sure what you’re trying to achieve. Could you clarify your goal? Is this question about maximizing I/O performance or minimizing your own effort?
If your aim is to extract particle IDs and positions from a large dump file, you could simply use the OVITO Python module for loading the file and accessing the data as NumPy views. Here’s a basic example:
from ovito.io import import_file

# Load the dump file and evaluate the pipeline (frame 0 by default).
data = import_file('file.dump').compute()
# Particle properties are exposed as read-only NumPy array views (no copy).
positions = data.particles.positions[...]
ids = data.particles.identifiers[...]
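If "processed later" means in a separate step, one option (a minimal sketch; the output file names are placeholders) is to write the two arrays to NumPy's binary format, which is much faster to reload than re-parsing the text dump:

import numpy as np
# Placeholder output names; np.save stores a copy in binary .npy format,
# which can later be reloaded with np.load (optionally memory-mapped).
np.save('ids.npy', ids)
np.save('positions.npy', positions)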
I’m also not sure what you meant by “process it via bash”. Bash is a shell, not an actual program for processing data files. Maybe you had the awk or sed commands in mind?