
Training the model fails: PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'E:/workspace/clone-voice/models/tts/run/training/GPT_XTTS_FT-June-27-2024_05+20PM-0000000\\trainer_0_log.txt' #121

Open
3293406747 opened this issue Jun 27, 2024 · 3 comments

@3293406747 commented Jun 27, 2024

proxy='http://127.0.0.1:7890'
Running on local URL:  http://0.0.0.0:5003
2024-06-27 17:31:19,914 [INFO] HTTP Request: GET http://localhost:5003/startup-events "HTTP/1.1 502 Bad Gateway"
2024-06-27 17:31:20,480 [INFO] HTTP Request: HEAD http://localhost:5003/ "HTTP/1.1 502 Bad Gateway"
2024-06-27 17:31:20,711 [INFO] HTTP Request: GET https://checkip.amazonaws.com/ "HTTP/1.1 200 "
2024-06-27 17:31:20,955 [INFO] HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
IMPORTANT: You are using gradio version 4.19.2, however version 4.29.0 is available, please upgrade.
--------
2024-06-27 17:31:21,481 [INFO] HTTP Request: HEAD http://localhost:5003/ "HTTP/1.1 502 Bad Gateway"
2024-06-27 17:31:22,191 [INFO] HTTP Request: HEAD http://localhost:5003/ "HTTP/1.1 502 Bad Gateway"
2024-06-27 17:31:23,016 [INFO] HTTP Request: POST https://api.gradio.app/gradio-initiated-analytics/ "HTTP/1.1 200 OK"
2024-06-27 17:31:23,178 [INFO] HTTP Request: HEAD http://localhost:5003/ "HTTP/1.1 502 Bad Gateway"
2024-06-27 17:31:24,218 [INFO] HTTP Request: HEAD http://localhost:5003/ "HTTP/1.1 502 Bad Gateway"
2024-06-27 17:31:27,565 [INFO] HTTP Request: GET https://api.gradio.app/v2/tunnel-request "HTTP/1.1 200 OK"
Running on public URL: https://2ab484acf533a8b95a.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)
Exception in thread Thread-6 (_do_normal_analytics_request):
Traceback (most recent call last):
  File "E:\workspace\clone-voice\venv\httpx\_transports\default.py", line 69, in map_httpcore_exceptions
    yield
  File "E:\workspace\clone-voice\venv\httpx\_transports\default.py", line 233, in handle_request
    resp = self._pool.handle_request(req)
  File "E:\workspace\clone-voice\venv\httpcore\_sync\connection_pool.py", line 216, in handle_request
    raise exc from None
  File "E:\workspace\clone-voice\venv\httpcore\_sync\connection_pool.py", line 196, in handle_request
    response = connection.handle_request(
  File "E:\workspace\clone-voice\venv\httpcore\_sync\http_proxy.py", line 317, in handle_request
    stream = stream.start_tls(**kwargs)
  File "E:\workspace\clone-voice\venv\httpcore\_sync\http11.py", line 383, in start_tls
    return self._stream.start_tls(ssl_context, server_hostname, timeout)
  File "E:\workspace\clone-voice\venv\httpcore\_backends\sync.py", line 152, in start_tls
    with map_exceptions(exc_map):
  File "D:\miniconda3\lib\contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "E:\workspace\clone-voice\venv\httpcore\_exceptions.py", line 14, in map_exceptions
    raise to_exc(exc) from exc
httpcore.ConnectTimeout: _ssl.c:990: The handshake operation timed out

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "D:\miniconda3\lib\threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "D:\miniconda3\lib\threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "E:\workspace\clone-voice\venv\gradio\analytics.py", line 63, in _do_normal_analytics_request
    httpx.post(url, data=data, timeout=5)
  File "E:\workspace\clone-voice\venv\httpx\_api.py", line 319, in post
    return request(
  File "E:\workspace\clone-voice\venv\httpx\_api.py", line 106, in request
    return client.request(
  File "E:\workspace\clone-voice\venv\httpx\_client.py", line 827, in request
    return self.send(request, auth=auth, follow_redirects=follow_redirects)
  File "E:\workspace\clone-voice\venv\httpx\_client.py", line 914, in send
    response = self._send_handling_auth(
  File "E:\workspace\clone-voice\venv\httpx\_client.py", line 942, in _send_handling_auth
    response = self._send_handling_redirects(
  File "E:\workspace\clone-voice\venv\httpx\_client.py", line 979, in _send_handling_redirects
    response = self._send_single_request(request)
  File "E:\workspace\clone-voice\venv\httpx\_client.py", line 1015, in _send_single_request
    response = transport.handle_request(request)
  File "E:\workspace\clone-voice\venv\httpx\_transports\default.py", line 232, in handle_request
    with map_httpcore_exceptions():
  File "D:\miniconda3\lib\contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "E:\workspace\clone-voice\venv\httpx\_transports\default.py", line 86, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.ConnectTimeout: _ssl.c:990: The handshake operation timed out
2024-06-27 17:31:36,141 [ERROR] Exception in callback _ProactorBasePipeTransport._call_connection_lost(None)
handle: <Handle _ProactorBasePipeTransport._call_connection_lost(None)>
Traceback (most recent call last):
  File "D:\miniconda3\lib\asyncio\events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
  File "D:\miniconda3\lib\asyncio\proactor_events.py", line 165, in _call_connection_lost
    self._sock.shutdown(socket.SHUT_RDWR)
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host.
Loading Whisper Model!
2024-06-27 17:31:59,734 [INFO] Processing audio with duration 02:01.217
2024-06-27 17:32:00,594 [INFO] VAD filter removed 00:07.441 of audio
Data processing finished, starting training!
trainfile='E:\\workspace\\clone-voice\\models\\tts\\dataset502508\\metadata_train.csv'
evalfile='E:\\workspace\\clone-voice\\models\\tts\\dataset502508\\metadata_eval.csv'
>> DVAE weights restored from: E:\workspace\clone-voice\models\tts\run\training\XTTS_v2.0_original_model_files/dvae.pth
 | > Found 11 files in E:\workspace\clone-voice\models\tts\dataset502508
fatal: not a git repository (or any of the parent directories): .git
fatal: not a git repository (or any of the parent directories): .git
 > Training Environment:
 | > Backend: Torch
 | > Mixed precision: False
 | > Precision: float32
 | > Current device: 0
 | > Num. of GPUs: 1
 | > Num. of CPUs: 12
 | > Num. of Torch Threads: 1
 | > Torch seed: 1
 | > Torch CUDNN: True
 | > Torch CUDNN deterministic: False
 | > Torch CUDNN benchmark: False
 | > Torch TF32 MatMul: False
 > Start Tensorboard: tensorboard --logdir=E:\workspace\clone-voice\models\tts\run\training\GPT_XTTS_FT-June-27-2024_05+33PM-0000000

 > Model has 518442047 parameters
 > Sampling by language: dict_keys(['zh'])

 > EPOCH: 0/15
 --> E:\workspace\clone-voice\models\tts\run\training\GPT_XTTS_FT-June-27-2024_05+33PM-0000000

 > TRAINING (2024-06-27 17:33:49) 
proxy='http://127.0.0.1:7890'
proxy='http://127.0.0.1:7890'
proxy='http://127.0.0.1:7890'
proxy='http://127.0.0.1:7890'
proxy='http://127.0.0.1:7890'
proxy='http://127.0.0.1:7890'
proxy='http://127.0.0.1:7890'
proxy='http://127.0.0.1:7890'

   --> TIME: 2024-06-27 17:36:01 -- STEP: 0/3 -- GLOBAL_STEP: 0
     | > loss_text_ce: 0.04092761129140854  (0.04092761129140854)
     | > loss_mel_ce: 4.284645080566406  (4.284645080566406)
     | > loss: 4.325572490692139  (4.325572490692139)
     | > grad_norm: 0  (0)
     | > current_lr: 5e-06 
     | > step_time: 30.104  (30.103973865509033)
     | > loader_time: 101.4882  (101.48818254470825)

Traceback (most recent call last):
  File "E:\workspace\clone-voice\venv\trainer\trainer.py", line 1833, in fit
    self._fit()
  File "E:\workspace\clone-voice\venv\trainer\trainer.py", line 1785, in _fit
    self.train_epoch()
  File "E:\workspace\clone-voice\venv\trainer\trainer.py", line 1504, in train_epoch
    outputs, _ = self.train_step(batch, batch_num_steps, cur_step, loader_start_time)
  File "E:\workspace\clone-voice\venv\trainer\trainer.py", line 1360, in train_step
    outputs, loss_dict_new, step_time = self.optimize(
  File "E:\workspace\clone-voice\venv\trainer\trainer.py", line 1226, in optimize
    outputs, loss_dict = self._compute_loss(
  File "E:\workspace\clone-voice\venv\trainer\trainer.py", line 1157, in _compute_loss
    outputs, loss_dict = self._model_train_step(batch, model, criterion)
  File "E:\workspace\clone-voice\venv\trainer\trainer.py", line 1116, in _model_train_step
    return model.train_step(*input_args)
  File "E:\workspace\clone-voice\venv\TTS\tts\layers\xtts\trainer\gpt_trainer.py", line 308, in train_step
    loss_text, loss_mel, _ = self.forward(
  File "E:\workspace\clone-voice\venv\TTS\tts\layers\xtts\trainer\gpt_trainer.py", line 215, in forward
    losses = self.xtts.gpt(
  File "E:\workspace\clone-voice\venv\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "E:\workspace\clone-voice\venv\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\workspace\clone-voice\venv\TTS\tts\layers\xtts\gpt.py", line 511, in forward
    text_logits, mel_logits = self.get_logits(
  File "E:\workspace\clone-voice\venv\TTS\tts\layers\xtts\gpt.py", line 279, in get_logits
    gpt_out = self.gpt(
  File "E:\workspace\clone-voice\venv\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "E:\workspace\clone-voice\venv\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\workspace\clone-voice\venv\transformers\models\gpt2\modeling_gpt2.py", line 888, in forward
    outputs = block(
  File "E:\workspace\clone-voice\venv\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "E:\workspace\clone-voice\venv\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\workspace\clone-voice\venv\transformers\models\gpt2\modeling_gpt2.py", line 390, in forward
    attn_outputs = self.attn(
  File "E:\workspace\clone-voice\venv\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "E:\workspace\clone-voice\venv\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\workspace\clone-voice\venv\transformers\models\gpt2\modeling_gpt2.py", line 331, in forward
    attn_output, attn_weights = self._attn(query, key, value, attention_mask, head_mask)
  File "E:\workspace\clone-voice\venv\transformers\models\gpt2\modeling_gpt2.py", line 183, in _attn
    attn_weights = torch.matmul(query, key.transpose(-1, -2))
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB. GPU 0 has a total capacty of 4.00 GiB of which 0 bytes is free. Of the allocated memory 10.26 GiB is allocated by PyTorch, and 537.42 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "E:\workspace\clone-voice\train.py", line 300, in train_model
    config_path, original_xtts_checkpoint, vocab_file, exp_path, speaker_wav = train_gpt(language, args['num_epochs'], args['batch_size'], args['grad_acumm'], trainfile, evalfile, output_path=args['out_path'], max_audio_length=max_audio_length)
  File "E:\workspace\clone-voice\venv\TTS\demos\xtts_ft_demo\utils\gpt_train.py", line 159, in train_gpt
    trainer.fit()
  File "E:\workspace\clone-voice\venv\trainer\trainer.py", line 1860, in fit
    remove_experiment_folder(self.output_path)
  File "E:\workspace\clone-voice\venv\trainer\generic_utils.py", line 77, in remove_experiment_folder
    fs.rm(experiment_path, recursive=True)
  File "E:\workspace\clone-voice\venv\fsspec\implementations\local.py", line 168, in rm
    shutil.rmtree(p)
  File "D:\miniconda3\lib\shutil.py", line 750, in rmtree
    return _rmtree_unsafe(path, onerror)
  File "D:\miniconda3\lib\shutil.py", line 620, in _rmtree_unsafe
    onerror(os.unlink, fullname, sys.exc_info())
  File "D:\miniconda3\lib\shutil.py", line 618, in _rmtree_unsafe
    os.unlink(fullname)
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'E:/workspace/clone-voice/models/tts/run/training/GPT_XTTS_FT-June-27-2024_05+33PM-0000000\\trainer_0_log.txt'
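The root cause in this traceback chain is the torch.cuda.OutOfMemoryError: a 4 GiB GPU is too small for the default XTTS fine-tuning settings. A minimal sketch of the usual mitigations, assuming the `batch_size`/`grad_acumm` arguments visible in the `train_model` call above; the allocator setting simply follows the hint printed in the OOM message:

```python
import os

# Must be set before torch initializes CUDA; follows the
# PYTORCH_CUDA_ALLOC_CONF hint in the OOM message to reduce
# fragmentation on small (4 GiB) GPUs.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

def effective_batch(batch_size: int, grad_acumm: int) -> int:
    """Gradient accumulation trades per-step GPU memory for extra steps
    while keeping the effective batch size: batch_size * grad_acumm."""
    return batch_size * grad_acumm

# Shrinking batch_size 4 -> 1 while raising grad_acumm 1 -> 4 keeps the
# same effective batch at a fraction of the activation memory.
assert effective_batch(1, 4) == effective_batch(4, 1)
```

In this project those values are passed through the args dict (`args['batch_size']`, `args['grad_acumm']`) shown in the train.py frame of the traceback.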
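The PermissionError itself is secondary: after the OOM, the trainer's `fit()` tries to delete the run folder, and on Windows `shutil.rmtree` cannot unlink `trainer_0_log.txt` while a logging handler still holds it open. A hedged workaround sketch (the helper name is illustrative, not part of the trainer API): close file handlers under the folder before removing it.

```python
import logging
import os
import shutil

def remove_run_folder(path: str) -> None:
    """Delete a training run folder safely on Windows.

    Closes any logging.FileHandler whose file lives under `path` first,
    so the still-open trainer_0_log.txt no longer triggers WinError 32.
    """
    root = os.path.abspath(path)
    loggers = [logging.getLogger()] + [
        logging.getLogger(name) for name in logging.root.manager.loggerDict
    ]
    for logger in loggers:
        for handler in list(logger.handlers):
            if isinstance(handler, logging.FileHandler) and \
                    os.path.abspath(handler.baseFilename).startswith(root):
                handler.close()
                logger.removeHandler(handler)
    shutil.rmtree(path, ignore_errors=True)
```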


@jianchang512 (Owner)

Delete the half-downloaded model and re-download it.
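A hedged way to check for the half-downloaded model this advice refers to: verify that the expected checkpoint files exist and are non-empty before retrying. The directory matches the `XTTS_v2.0_original_model_files` path in the log above, but the exact file list is an assumption, not something stated in this thread.

```python
import os

def incomplete_files(model_dir: str, expected: list[str]) -> list[str]:
    """Return the expected model files that are missing or zero bytes,
    i.e. candidates for 'delete and re-download'."""
    bad = []
    for name in expected:
        p = os.path.join(model_dir, name)
        if not os.path.isfile(p) or os.path.getsize(p) == 0:
            bad.append(name)
    return bad

# Assumed file list for an XTTS v2 checkpoint; dvae.pth is confirmed by the
# ">> DVAE weights restored from" log line, the rest are typical names.
expected = ["dvae.pth", "model.pth", "config.json", "vocab.json"]
# incomplete_files(r"E:\workspace\clone-voice\models\tts\run\training"
#                  r"\XTTS_v2.0_original_model_files", expected)
```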

@3293406747 (Author)

> Delete the half-downloaded model and re-download it.

Do you mean these files? I already tried deleting them and it had no effect; the same error still occurs.

(screenshot: the model files in question)

@jianchang512 (Owner)

Try using a stable proxy.
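For context, the httpx.ConnectTimeout traceback earlier in the log is Gradio's analytics request timing out through the proxy at 127.0.0.1:7890; it is noise rather than the training failure. A hedged sketch of how to quiet it and route downloads through a stable proxy (`GRADIO_ANALYTICS_ENABLED` is a documented Gradio environment variable; the proxy address is just the one from this log):

```python
import os

# Stop Gradio's background analytics POSTs, which time out through an
# unstable proxy and fill the log with httpx.ConnectTimeout tracebacks.
os.environ["GRADIO_ANALYTICS_ENABLED"] = "False"

# If model downloads must go through a proxy, point both schemes at a
# stable endpoint (address below is the one shown in this log).
os.environ["HTTP_PROXY"] = "http://127.0.0.1:7890"
os.environ["HTTPS_PROXY"] = "http://127.0.0.1:7890"
```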
