PyTorch RNN memory allocation error in DataLoader

asinelni

I am writing an RNN in PyTorch and have the following code:

data_loader = torch.utils.data.DataLoader(
    data,
    batch_size=args.batch_size,
    shuffle=True,
    num_workers=args.num_workers,
    drop_last=True)

If I set num_workers to 0, I get a segmentation fault. If I set num_workers > 0, I get this traceback:

Traceback (most recent call last):
  File "rnn_model.py", line 352, in <module>
    train_model(train_data, dev_data, test_data, model, args)
  File "rnn_model.py", line 212, in train_model
    loss = run_epoch(train_data, True, model, optimizer, args)
  File "rnn_model.py", line 301, in run_epoch
    for batch in tqdm.tqdm(data_loader):
  File "/home/username/miniconda3/lib/python2.7/site-packages/tqdm/_tqdm.py", line 872, in __iter__
    for obj in iterable:
  File "/home/username/miniconda3/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 303, in __iter__
    return DataLoaderIter(self)
  File "/home/username/miniconda3/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 162, in __init__
    w.start()
  File "/home/username/miniconda3/lib/python2.7/multiprocessing/process.py", line 130, in start
    self._popen = Popen(self)
  File "/home/username/miniconda3/lib/python2.7/multiprocessing/forking.py", line 121, in __init__
    self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory
Tags: pytorch

Answers

answered 8 months ago ginge #1

You are trying to load more data than your system can hold in RAM. Either load only part of your data, or use/write a data loader that keeps in memory only the data needed for the current batch (see the sketch below).
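A minimal sketch of what such a lazy loader could look like, assuming each sample is stored in its own file on disk (the file paths, storage format, and DataLoader arguments here are illustrative, not taken from the question):

import torch
from torch.utils.data import Dataset, DataLoader

class LazyDataset(Dataset):
    """Reads one sample from disk per __getitem__ call, so only the
    current batch (plus worker prefetch) needs to fit in RAM."""

    def __init__(self, sample_paths):
        # Only the list of file paths lives in memory.
        self.sample_paths = sample_paths

    def __len__(self):
        return len(self.sample_paths)

    def __getitem__(self, idx):
        # Assumes each file holds a (features, label) pair saved with
        # torch.save; adapt this to your own storage format.
        features, label = torch.load(self.sample_paths[idx])
        return features, label

data_loader = DataLoader(LazyDataset(sample_paths), batch_size=32,
                         shuffle=True, num_workers=2, drop_last=True)

With a setup like this, the worker processes started by num_workers no longer each need the full dataset in memory, only the samples they are currently reading.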

answered 6 months ago Mo Hossny #2

My guess is that the batch size and number of workers coming through args are being cast incorrectly or misinterpreted.

Please print them out and make sure you got the values you passed.
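For example, something like this right before constructing the DataLoader would confirm the types and values (the explicit int() casts are only an illustration, in case the values arrive as strings):

print(type(args.batch_size), args.batch_size)
print(type(args.num_workers), args.num_workers)

# Cast explicitly if they come in as strings (e.g. from a config file).
batch_size = int(args.batch_size)
num_workers = int(args.num_workers)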
