Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add pre-generated prompts option for benchmark #1091

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

omer-demir
Copy link

During benchmarking, we wanted to have pre-generated prompts that have been prepared for better benchmark result. Hence, It can be handy during benchmarking. In our test, we wanted to focus only token generation and sampling on SLM.

@omer-demir omer-demir force-pushed the omerdemir/pre_generated_prompts branch from f8d1bdc to 0b19f52 Compare November 25, 2024 08:59
@omer-demir omer-demir force-pushed the omerdemir/pre_generated_prompts branch from 0b19f52 to 243e7ea Compare December 20, 2024 19:00
@@ -232,6 +240,9 @@ def run_benchmark(args, batch_size, prompt_length, generation_length, max_length
# use random tokens instead of generating a prompt using the model and then tokenizing it
tokens = np.random.randint(100, size=(batch_size, prompt_length))
prompt = [tokenizer.decode(tokens[0])] * batch_size
elif args.use_prompt_set:
prompt = get_prompt_by_length(prompt_length)
tokens = tokenizer.encode_batch(prompt)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Different tokenizers can encode prompts into different prompt lengths. Some additional work is needed to get the desired prompt length. You can see an example of how to do this here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so basically, we will check the tokens length against requested prompt_length and we will add/trim if it is needed. Is that correct?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's correct.

@@ -83,6 +83,14 @@ def generate_prompt(model, tokenizer, prompt_length, use_graph_capture) -> str:
generator.generate_next_token()
return tokenizer.decode(generator.get_sequence(0))

# Use prompt length to get pre-defined prompt
def get_prompt_by_length(prompt_length):
json_path = "prompts.json"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of uploading another copy of prompts.json, can we download it from here and save it to disk using requests or urllib instead? That way, only one location has to be updated when adding other prompts.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it make sense to rely on external file? If that changes, we may have problem with this. Besides, in the benchmark environment, it seems logical to use local files instead of relying on internet connectivity. What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If connectivity is a concern in your benchmark environment, then I think it's fine to include a copy of the file.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants