Fix channels for s3
Fixed blocking code for S3 channels when the dataset size is greater than 64.
Rexwang8 authored Oct 22, 2023
1 parent 1bca3fc commit 7994a99
Showing 1 changed file with 10 additions and 8 deletions.
18 changes: 10 additions & 8 deletions cmd/dataset_tokenizer/dataset_tokenizer.go
@@ -416,17 +416,19 @@ func ReadTextsFromS3(
 		go startReader()
 	}

-	// List objects recursively.
-	getObjectsS3Recursively(svc, bucketName, "", objects)
+	go func() {
+		// List objects recursively.
+		getObjectsS3Recursively(svc, bucketName, "", objects)

-	// Close the objects channel when done.
-	close(objects)
+		// Close the objects channel when done.
+		close(objects)

-	// Wait for all reader goroutines to finish.
-	wg.Wait()
+		// Wait for all reader goroutines to finish.
+		wg.Wait()

-	// Close the runeReaders channel.
-	close(runeReaders)
+		// Close the runeReaders channel.
+		close(runeReaders)
+	}()

 	return runeReaders, nil
 }
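
The change above moves listing, closing, and waiting into a background goroutine so that `runeReaders` is returned before any of them can block. A minimal, self-contained sketch of the same pattern follows. It is not the project's code: `produce`, its parameters, and the buffer size of 64 are hypothetical stand-ins (the actual channel capacities are not shown in this hunk, and 64 is only inferred from the commit message). Without the `go func()` wrapper, the feeding loop would stall once both buffers fill, because the caller cannot drain a channel it has not yet received.

```go
package main

import (
	"fmt"
	"sync"
)

// produce is a hypothetical stand-in for ReadTextsFromS3. It feeds n items
// to worker goroutines and returns a buffered channel of results.
func produce(n, buffer, workers int) <-chan int {
	items := make(chan int, buffer)   // plays the role of `objects`
	results := make(chan int, buffer) // plays the role of `runeReaders`
	var wg sync.WaitGroup

	// Start the workers that turn items into results.
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for it := range items {
				results <- it * 2
			}
		}()
	}

	// Feed, close, and wait in the background. Doing this inline would
	// block before `return` whenever n exceeds what the buffers can hold,
	// since nothing reads `results` until the caller has it.
	go func() {
		for i := 0; i < n; i++ {
			items <- i
		}
		close(items)   // no more work for the workers
		wg.Wait()      // all workers have finished sending
		close(results) // safe to signal completion to the caller
	}()

	return results
}

func main() {
	count := 0
	for range produce(1000, 64, 4) {
		count++
	}
	fmt.Println(count) // prints 1000
}
```

Closing `results` only after `wg.Wait()` matters: closing it earlier would make a still-running worker panic on send.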