	---
	license: bigcode-openrail-m
	---
	Note : The adapter and the related GLaDOS code are licensed under Apache 2.0; however, the base model is licensed under bigcode-openrail-m. Because this adapter depends on the base model, you must still adhere to the OpenRAIL license.
	As such, I have marked bigcode-openrail-m as the license for this model, since it _effectively_ is.


	GLaDOS speaks Markdown!

	Usage

	To use this model, you must first go to the bigcode starcoder model page and accept its license, then create an access token for your account and insert it into the code below.
	```
	import torch
	from peft import PeftModel, PeftConfig
	from transformers import AutoModelForCausalLM, AutoTokenizer

	# Setup Model
	path = "JamesConley/glados_starcoder"
	token = "PUT YOUR TOKEN HERE"  # replace with your Hugging Face access token
	config = PeftConfig.from_pretrained(path)
	base_model_path = config.base_model_name_or_path
	# Load the base model in half precision, then attach the adapter weights
	model = AutoModelForCausalLM.from_pretrained(base_model_path, torch_dtype=torch.float16, device_map="auto", use_auth_token=token)
	model = PeftModel.from_pretrained(model, path, device_map="auto")

	# Setup Tokenizer
	tokenizer = AutoTokenizer.from_pretrained(base_model_path, truncation_side="left", use_auth_token=token)

	# Encode Tokens
	text = """User :\nWrite a python function that trains a classifier on data loaded from a csv.\nGLaDOS :\n"""
	input_ids = tokenizer(text, return_tensors="pt", truncation=True, max_length=2048).input_ids

	# Move tokens to GPU (uncomment when running on a GPU)
	#input_ids = input_ids.to("cuda")

	# Perform Inference
	with torch.no_grad():
	    with torch.cuda.amp.autocast():
	        gen_tokens = model.generate(
	            input_ids=input_ids, max_new_tokens=256
	        )

	# Decode Tokens (returns a list with one string per generated sequence)
	gen_text = tokenizer.batch_decode(gen_tokens)
	```
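
The prompt in the example above follows a `User :` / `GLaDOS :` turn layout, and the decoded text contains the prompt as well as the reply. The following is a minimal sketch of two helpers for working with that layout. The names `build_prompt` and `extract_reply` are hypothetical and not part of this repository, and cutting the reply at a following `User :` tag assumes the model may continue into another turn, which the source does not confirm.

```python
USER_TAG = "User :"
GLADOS_TAG = "GLaDOS :"


def build_prompt(message: str) -> str:
    """Wrap a user message in the turn layout used by the example prompt."""
    return f"{USER_TAG}\n{message.strip()}\n{GLADOS_TAG}\n"


def extract_reply(decoded: str) -> str:
    """Return the text after the first GLaDOS tag, cut at the next User tag if present."""
    # Everything after the first GLaDOS tag is the model's reply (empty if the tag is missing).
    _, _, reply = decoded.partition(GLADOS_TAG)
    # Assumption: drop any further turn the model may have started on its own.
    reply = reply.partition(USER_TAG)[0]
    return reply.strip()
```

With the code above, `build_prompt(...)` could replace the hand-written `text` string, and `extract_reply(gen_text[0])` could be applied to the first decoded sequence.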

	Result
	Note : The backslash (\) characters before the inner code fences are not part of the model output. I added them to prevent the model card from rendering the code block incorrectly.
	```
	User :
	Write a python function that trains a classifier on data loaded from a csv.
	GLaDOS :
	Here is an example of a function that trains a classifier on data loaded from a CSV file:

	\```python
	import pandas as pd
	from sklearn.model_selection import train_test_split
	from sklearn.ensemble import RandomForestClassifier

	def train_classifier(data_file, target_column, test_size=0.2):
	    # Load the data from the CSV file
	    data = pd.read_csv(data_file)

	    # Split the data into training and testing sets
	    X_train, X_test, y_train, y_test = train_test_split(data.drop(target_column, axis=1),
	                                                        data[target_column],
	                                                        test_size=test_size)

	    # Train the classifier
	    clf = RandomForestClassifier()
	    clf.fit(X_train, y_train)

	    # Return the trained classifier and the test set predictions
	    return clf, clf.predict(X_test)

	\```
	This function takes in the following arguments:

	* `data_file`: the path to the CSV file containing the data
	* `target_column`: the name of the column in the CSV file that contains the target variable
	```

	Markdown rendered output :


	User :
	Write a python function that trains a classifier on data loaded from a csv.
	GLaDOS :
	Here is an example of a function that trains a classifier on data loaded from a CSV file:

	```python
	import pandas as pd
	from sklearn.model_selection import train_test_split
	from sklearn.ensemble import RandomForestClassifier

	def train_classifier(data_file, target_column, test_size=0.2):
	    # Load the data from the CSV file
	    data = pd.read_csv(data_file)

	    # Split the data into training and testing sets
	    X_train, X_test, y_train, y_test = train_test_split(data.drop(target_column, axis=1),
	                                                        data[target_column],
	                                                        test_size=test_size)

	    # Train the classifier
	    clf = RandomForestClassifier()
	    clf.fit(X_train, y_train)

	    # Return the trained classifier and the test set predictions
	    return clf, clf.predict(X_test)

	```
	This function takes in the following arguments:

	* `data_file`: the path to the CSV file containing the data
	* `target_column`: the name of the column in the CSV file that contains the target variable
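
Because GLaDOS replies in Markdown, as in the output above, it can be useful to pull the fenced code out of a reply before saving or running it. The following is a minimal sketch that assumes the reply uses standard triple-backtick fences with an optional language name on the opening line. The helper name `extract_code_blocks` is hypothetical and not part of this repository.

```python
import re
from typing import List, Tuple

# Build the fence marker from single backticks so this snippet can itself sit inside a fenced block.
FENCE = "`" * 3
FENCE_RE = re.compile(FENCE + r"([^\n`]*)\n(.*?)" + FENCE, re.DOTALL)


def extract_code_blocks(markdown: str) -> List[Tuple[str, str]]:
    """Return (language, code) pairs for each fenced block found in the text."""
    return [
        (lang.strip(), code.rstrip("\n"))
        for lang, code in FENCE_RE.findall(markdown)
    ]
```

Applied to a reply like the one shown above, this would return a single pair whose first element is `python` and whose second element is the source of `train_classifier`. Unterminated fences are ignored by this sketch.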