copyleft() {
cat<<-EOF
traintest:: generate a train/test set for N-way cross validation
Copyright (C) 2004 Tim Menzies
This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation, version 2.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
EOF
}
Don't test on what you train on. Separate the available data into a training set and a test set.
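To make the idea concrete, here is a minimal sketch of picking one fold out of N from a list of data rows. It is an illustration only: `split_fold` is a hypothetical helper, it assigns rows to folds round-robin, and it does not claim to reproduce how Weka itself selects the instances for a fold.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of traintest.
# Usage: split_fold FOLDS FOLD TRAIN_FILE TEST_FILE < rows
# Rows are dealt to folds round-robin; rows in fold FOLD go to TEST_FILE,
# every other row goes to TRAIN_FILE.
split_fold() {
  local folds=$1 fold=$2 train_out=$3 test_out=$4
  # Create both outputs up front so they exist even if one ends up empty.
  : > "$train_out"
  : > "$test_out"
  awk -v N="$folds" -v F="$fold" -v tr="$train_out" -v te="$test_out" '
    { if ((NR - 1) % N + 1 == F) { print > te } else { print > tr } }
  '
}
```

For example, `split_fold 10 1 train.txt test.txt < rows.txt` would put roughly one tenth of the rows in `test.txt` and the rest in `train.txt`.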
usage() {
cat<<-EOF
Usage: traintest [FLAGs] ArffFileStem
Split $data/ArffFileStem.arff into train.arff and test.arff
Flags:
-f NUM number of folds; default=$Q$folds0$Q
-n NUM the nth fold to extract; default=$Q$fold0$Q
-t STEM stem of the training set; default=$Q$train0$Q
-T STEM stem of the test set; default=$Q$test0$Q
-h print this help text
-l copyright notice
-x run an example
EOF
exit
}
sirius:~/public_html/dm [76]$ traintest -x
---| Train |------------
@relation weather-weka.filters.SplitDatasetFilter-S0-V-N10-F1
@attribute outlook {sunny,overcast,rainy}
@attribute temperature numeric
@attribute humidity numeric
@attribute windy {TRUE,FALSE}
@attribute play {yes,no}
@data
overcast,83,86,FALSE,yes
rainy,70,96,FALSE,yes
rainy,68,80,FALSE,yes
rainy,65,70,TRUE,no
overcast,64,65,TRUE,yes
sunny,72,95,FALSE,no
sunny,69,70,FALSE,yes
rainy,75,80,FALSE,yes
sunny,75,70,TRUE,yes
overcast,72,90,TRUE,yes
overcast,81,75,FALSE,yes
rainy,71,91,TRUE,no
---| Test |------------
@relation weather-weka.filters.SplitDatasetFilter-S0-N10-F1
@attribute outlook {sunny,overcast,rainy}
@attribute temperature numeric
@attribute humidity numeric
@attribute windy {TRUE,FALSE}
@attribute play {yes,no}
@data
sunny,85,85,FALSE,no
sunny,80,90,TRUE,no
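A full N-way cross validation needs one train/test pair per fold. The sketch below loops over the folds using only the flags documented in the usage text. The function name `all_folds` and the `RUN` hook are assumptions made for this example, as is the guess that valid fold numbers run from 1 to the number of folds. By default it only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper, not part of traintest.
# Usage: all_folds ArffFileStem [FOLDS]
# RUN defaults to "echo" (dry run). Set RUN="" to really call traintest.
all_folds() {
  local stem=$1 folds=${2:-10} n
  for ((n = 1; n <= folds; n++)); do
    # Each fold writes its own pair of files: train1.arff/test1.arff, ...
    ${RUN-echo} traintest -f "$folds" -n "$n" -t "train$n" -T "test$n" "$stem"
  done
}
```

Running `all_folds weather 10` prints ten `traintest` commands; `RUN="" all_folds weather 10` would execute them, assuming `traintest` is on your path.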
First, if you have installed anything from this site before, save your config file to somewhere safe.
Second, copy the following files to your directory (from
either ~timm/public_html/dm or
http://www.cs.pdx.edu/~timm/dm or
from http://www.cs.pdx.edu/~timm/dm/traintest.zip):
config, and traintest.
Third, make traintest executable:
chmod +x traintest
Fourth, compare your safe version of config
with the new version you just copied and fix up any paths.
Fifth, edit this file and config.
The first line of this file should point to your local bash shell,
and you'll need to check at least the #paths section in config.
Check that it all works:
traintest -x
If the installation worked, you should see two arff files printed: a larger training set and a smaller test set.
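If the check fails, a likely cause is a path that was not fixed up in config. The following sketch tests whether a config file sets the four variables that the script's main function reads (`java`, `wekajar`, `nways` and `data`). The helper `check_config` is hypothetical and only checks that the variables are non-empty, not that the paths they name exist.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check, not part of traintest.
# Usage: check_config [CONFIG_FILE]   (default: ./config)
check_config() {
  local cfg=${1:-./config}
  (
    # Source the config in a subshell so it cannot alter the caller.
    . "$cfg" || exit 1
    missing=""
    for v in java wekajar nways data; do
      # ${!v} is bash indirect expansion: the value of the variable named by v.
      [ -n "${!v}" ] || missing="$missing $v"
    done
    if [ -n "$missing" ]; then
      echo "missing:$missing"
      exit 1
    fi
    echo "ok"
  )
}
```

Calling `check_config ./config` prints `ok` when all four variables are set, or a `missing:` line naming the ones that are not.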
Defaults:
folds0="10" fold0="1" train0="train" test0="test"
Paths:
. config
Minor details:
Q="\""
traintestDemo() {
main weather
echo ""
echo "---| Train |------------"
cat $train.arff
echo ""
echo "---| Test |------------"
cat $test.arff
}
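main() {
# Test set: the selected fold ($fold of $folds).
$java -Xmx1024M -cp $wekajar $nways \
-N $folds -F $fold -i "$data/$1.arff" \
-o "$test.arff"
# Training set: -V inverts the selection, giving everything outside that fold.
$java -Xmx1024M -cp $wekajar $nways \
-V -N $folds -F $fold -i "$data/$1.arff" \
-o "$train.arff"
}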
demo=""
while getopts "f:hln:t:T:x" flag
do case "$flag" in
f) folds=$OPTARG;;
h) usage; exit ;;
l) copyleft; exit;;
n) fold=$OPTARG;;
t) train=$OPTARG;;
T) test=$OPTARG;;
x) demo="traintestDemo";;
esac
done
shift $(($OPTIND - 1))
folds=${folds:=$folds0}
fold=${fold:=$fold0}
n=${n:=$n0}
train=${train:=$train0}
test=${test:=$test0}
if [ -n "$demo" ]
then $demo
exit
else main $1
fi
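The parse-then-default pattern used above can be tried on its own, without Weka installed. This sketch wraps the same `getopts` loop and `${var:=default}` idiom in a function that just reports the settings it resolved. The name `resolve_flags` is hypothetical, and the default values are written inline here rather than read from `folds0`, `fold0`, `train0` and `test0`.

```shell
#!/usr/bin/env bash
# Hypothetical demonstration, not part of traintest.
# Usage: resolve_flags [-f NUM] [-n NUM] [-t STEM] [-T STEM] ArffFileStem
resolve_flags() {
  local OPTIND=1 OPTARG flag folds fold train test
  while getopts "f:n:t:T:" flag; do
    case "$flag" in
      f) folds=$OPTARG ;;
      n) fold=$OPTARG ;;
      t) train=$OPTARG ;;
      T) test=$OPTARG ;;
      *) return 1 ;;   # unknown flag: getopts has already printed an error
    esac
  done
  shift $((OPTIND - 1))
  # ${var:=default} assigns the default only when var is unset or empty.
  : "${folds:=10}" "${fold:=1}" "${train:=train}" "${test:=test}"
  echo "folds=$folds fold=$fold train=$train.arff test=$test.arff stem=$1"
}
```

Flags that are given override the defaults and everything else falls back, so `resolve_flags -n 3 -T holdout weather` reports fold 3 and a test file of `holdout.arff` while keeping 10 folds and `train.arff`.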
Tim Menzies, tim@menzies.us, http://menzies.us
The JAVA machine learners used at this site come from the extensive data mining libraries found in the University of Waikato's Environment for Knowledge Analysis (the WEKA) http://www.cs.waikato.ac.nz/ml/weka/