Getting Started with Octave

Notes on Andrew Ng’s Machine Learning: (5) Octave Tutorial

Basic Operations

Elementary Operations

+, -, *, /, ^.

>> 5 + 6
ans = 11
>> 20 - 1
ans = 19
>> 3 * 4
ans = 12
>> 8 / 2
ans = 4
>> 2 ^ 8
ans = 256

Logical Operations

==, ~=, &&, ||, xor().

Note that the not-equal operator is ~=. Octave also accepts !=, but MATLAB does not.

>> 1 == 0
ans = 0
>> 1 ~= 0
ans = 1
>> 1 && 0
ans = 0
>> 1 || 0
ans = 1
>> xor(1, 0)
ans = 1

Change the Prompt

We can change the prompt via PS1():

>> PS1("octave: > ")
octave: > PS1(">> ")
>> PS1("octave: > ")
octave: > PS1("SOMETHING > ")
SOMETHING > PS1(">> ")
>> % Prompt changed

Variables

>> a = 3
a = 3
>> a = 3; % the semicolon suppresses output
>> c = (3 >= 1);
>> c
c = 1

Display variables

>> a = pi;
>> a
a = 3.1416
>> disp(a)
3.1416
>> disp(sprintf('2 decimals: %0.2f', a))
2 decimals: 3.14

We can also switch the default display precision with format long and format short:

>> a
a = 3.1416
>> format long
>> a
a = 3.141592653589793
>> format short
>> a
a = 3.1416
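
As a small sketch of formatted output: sprintf builds a string without printing it, and fprintf prints directly, so the disp(sprintf(...)) pair can often be shortened. The variable name s and the sample values below are illustrative assumptions, not part of the original notes.

```octave
a = pi;
s = sprintf('%0.4f', a);                  % format into a string, nothing is printed
fprintf('pi to 4 decimals: %s\n', s);     % print the string followed by a newline
fprintf('%d items cost %0.2f\n', 3, 9.5); % several values in one call
```

Unlike format long, these format strings affect only the one call and leave the global display setting untouched.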

Create Matrices

>> A = [1, 2, 3; 4, 5, 6]
A =

1 2 3
4 5 6

>> B = [1 3 5; 7 9 11]
B =

1 3 5
7 9 11

>> B = [1, 2, 3;
> 4, 5, 6;
> 7, 8, 9]
B =

1 2 3
4 5 6
7 8 9

>> C = [1, 2, 4, 8]
C =

1 2 4 8

>> D = [1; 2; 3; 4]
D =

1
2
3
4

There are some useful functions for generating matrices:

  • Generate vector of a range
>> v = 1:10    % start:end
v =

1 2 3 4 5 6 7 8 9 10

>> v = 1:0.1:2 % start:step:end
v =

Columns 1 through 8:

1.0000 1.1000 1.2000 1.3000 1.4000 1.5000 1.6000 1.7000

Columns 9 through 11:

1.8000 1.9000 2.0000
  • Generate matrices of all ones/zeros
>> ones(2, 3)
ans =

1 1 1
1 1 1

>> zeros(3, 2)
ans =

0 0
0 0
0 0

>> C = 2 * ones(4, 5)
C =

2 2 2 2 2
2 2 2 2 2
2 2 2 2 2
2 2 2 2 2

  • Generate identity matrices
>> eye(3)
ans =

Diagonal Matrix

1 0 0
0 1 0
0 0 1

  • Generate matrices of random values

Uniform distribution between 0 and 1:

>> D = rand(1, 3)
D =

0.14117 0.81424 0.83745

Gaussian random:

>> D = randn(1, 3)
D =

0.22133 -2.00002 1.61025


We can generate a Gaussian random vector w with 10000 elements and plot a histogram of it:

>> w = randn(1, 10000);
>> hist(w)

Output figure:

[Figure: histogram of w with the default 10 bins]

We can also plot a histogram with more buckets, 50 bins for example:

>> hist(w, 50)

[Figure: histogram of w with 50 bins]
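
When no plot window is available, the same information can be inspected numerically. The sketch below assumes hist is called with output arguments, in which case it returns the bin counts and bin centers instead of drawing; the names counts and centers are illustrative.

```octave
w = randn(1, 10000);               % 10000 draws from the standard normal distribution
[counts, centers] = hist(w, 50);   % bin counts and bin centers, no figure is drawn
sampleMean = mean(w);              % should be close to 0
sampleStd = std(w);                % should be close to 1
fprintf('mean = %0.3f, std = %0.3f, samples binned = %d\n', ...
        sampleMean, sampleStd, sum(counts));
```

The exact mean and standard deviation change on every run because the samples are random.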

Get Help

>> help

For help with individual commands and functions type

help NAME

......

>> help eye
'eye' is a built-in function from the file libinterp/corefcn/data.cc

......

>> help help

......

Moving Data Around

Size of matrix

size(): returns the size of a matrix as [rows, columns].

>> A = [1, 2; 3, 4; 5, 6]
A =

1 2
3 4
5 6

>> size(A) % get the size of A
ans =

3 2

>> sz = size(A); % size itself returns a 1x2 matrix
>> size(sz)
ans =

1 2

>> size(A, 1) % get the first dimension of A (i.e. the number of rows)
ans = 3
>> size(A, 2) % the number of columns
ans = 2

length(): returns the size of the longest dimension.

>> length(A)    % size of the longest dimension; confusing for matrices, not recommended
ans = 3
>> v = [1, 2, 3, 4];
>> length(v) % we usually apply length() to vectors only
ans = 4
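
A short sketch of two related idioms: size can return the rows and columns in separate variables, and numel gives the total number of elements. The variable names m, n and total are illustrative.

```octave
A = [1, 2; 3, 4; 5, 6];
[m, n] = size(A);     % m = 3 rows, n = 2 columns
total = numel(A);     % total number of elements: 3 * 2 = 6
fprintf('%d rows, %d columns, %d elements\n', m, n, total);
```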

Load data

We can use basic shell commands to find data that we want.

>> pwd
ans = /Users/c
>> cd MyProg/octave/
>> pwd
ans = /Users/c/MyProg/octave
>> ls
featureX.dat featureY.dat
>> ls -l
total 16
-rw-r--r-- 1 c staff 188 Sep 8 10:00 featureX.dat
-rw-r--r-- 1 c staff 135 Sep 8 10:00 featureY.dat

The load command loads data from a file.

>> load featureX.dat
>> load('featureY.dat')

After load, the data from each file is available as a matrix named after the file:

>> featureX
featureX =

2104 3
1600 3
2400 3
1416 2
......

>> size(featureX)
ans =

27 2

Show variables

who/whos: show the variables currently in memory.

>> who
Variables in the current scope:

A ans featureX featureY sz v w

>> whos % for more details
Variables in the current scope:

Attr Name Size Bytes Class
==== ==== ==== ===== =====
A 3x2 48 double
ans 1x2 16 double
featureX 27x2 432 double
featureY 27x1 216 double
sz 1x2 16 double
v 1x4 32 double
w 1x10000 80000 double

Total is 10095 elements using 80760 bytes

Clear variables

The clear command removes variables that are no longer needed.

>> who
Variables in the current scope:

A ans featureX featureY sz v w

>> clear A % clear a variable
>> clear sz v w % clear variables
>> whos
Variables in the current scope:

Attr Name Size Bytes Class
==== ==== ==== ===== =====
ans 1x2 16 double
featureX 27x2 432 double
featureY 27x1 216 double

Total is 83 elements using 664 bytes
>> clear % clear all variables
>> whos
>>

Save data

First, take the first five elements of featureY:

>> v = featureY(1:5)
v =

3999
3299
3690
2320
5399

>> whos
Variables in the current scope:

Attr Name Size Bytes Class
==== ==== ==== ===== =====
featureX 27x2 432 double
featureY 27x1 216 double
v 5x1 40 double

Total is 86 elements using 688 bytes

Save data to disk: save file_name variable [-ascii]

>> save hello.mat v    % save v to hello.mat (Octave writes its text format by default; pass -binary for a binary file)
>> ls
featureX.dat featureY.dat hello.mat
>> save hello.txt v -ascii; % save as plain ASCII text
>>

Then we can clear it from memory and load v back from disk:

>> clear v
>> whos
Variables in the current scope:

Attr Name Size Bytes Class
==== ==== ==== ===== =====
featureX 27x2 432 double
featureY 27x1 216 double

Total is 81 elements using 648 bytes

>> load hello.mat
>> whos
Variables in the current scope:

Attr Name Size Bytes Class
==== ==== ==== ===== =====
featureX 27x2 432 double
featureY 27x1 216 double
v 5x1 40 double

Total is 86 elements using 688 bytes

>>
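
The save/clear/load round trip can also be scripted. This is a minimal sketch using the functional forms save(file, 'name') and load(file); the temporary file name from tempname and the helper variables expected and same are assumptions made for illustration.

```octave
v = [3999; 3299; 3690];
expected = v;                  % keep a copy to compare against
fname = [tempname() '.mat'];   % hypothetical temporary file name
save(fname, 'v');              % functional form of: save <file> v
clear v;                       % v is gone from memory now
load(fname);                   % brings v back under its original name
same = isequal(v, expected);   % true if the round trip preserved the data
delete(fname);                 % remove the temporary file
```

The functional form is convenient when the file name is stored in a variable, which the command form (save hello.mat v) cannot express.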

Manipulate data

Get element from a matrix:

>> A = [1, 2; 3, 4; 5, 6]
A =

1 2
3 4
5 6

>> A(3, 2) % get the element at row 3, column 2
ans = 6
>> A(2, :) % ":" means every element along that row/column
ans =

3 4

>> A(:, 1)
ans =

1
3
5

>> A([1, 3], :) % get all elements of rows 1 and 3
ans =

1 2
5 6

Change the elements of a matrix:

>> A = [1, 2; 3, 4; 5, 6]
A =

1 2
3 4
5 6

>> A(:, 2) = [10, 11, 12]
A =

1 10
3 11
5 12

>> A(1, 1) = 0
A =

0 10
3 11
5 12

>> A = [A, [100; 101; 102]] % append another column vector to the right
A =

0 10 100
3 11 101
5 12 102

>> A = [1, 2; 3, 4; 5, 6]
A =

1 2
3 4
5 6

>> B = A + 10
B =

11 12
13 14
15 16

>> C = [A, B]
C =

1 2 11 12
3 4 13 14
5 6 15 16

>> D = [A; B];
>> size(D)
ans =

6 2

Put all elements of a matrix into a single column vector:

>> A
A =

0 10 100
3 11 101
5 12 102

>> A(:) % put all elements of A into a single vector
ans =

0
3
5
10
11
12
100
101
102
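
Two related indexing idioms, sketched with illustrative variable names: the keyword end stands for the last index along a dimension, and reshape undoes A(:) because both use column-major order.

```octave
A = [1, 2; 3, 4; 5, 6];
lastRow = A(end, :);                % last row: [5, 6]
col = A(:);                         % column-major order: [1; 3; 5; 2; 4; 6]
restored = reshape(col, size(A));   % back to the original 3x2 shape
```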

Computing on Data

Element-wise operations

Prefix an operator with a dot (.*, ./, .^) to apply it element by element instead of as a matrix operation.

>> A = [1, 2; 3, 4; 5, 6];
>> B = [11, 12; 13, 14; 15, 16];
>> C = [1 1; 2 2];
>> v = [1, 2, 3];
>> A .* B % element-wise multiplication (ans = [A(1,1)*B(1,1), A(1,2)*B(1,2); ...])
ans =

11 24
39 56
75 96

>> A .^ 2 % squaring each element of A
ans =

1 4
9 16
25 36

>> 1 ./ A
ans =

1.00000 0.50000
0.33333 0.25000
0.20000 0.16667

>> v + 1 % same as v + ones(1, length(v)); plain + is already element-wise, so no dot is needed
ans =

2 3 4
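
The difference between the dotted and undotted operators is easiest to see on a square matrix, where both forms are valid but give different results. A minimal sketch (M, matProd and elemProd are illustrative names):

```octave
M = [1, 2; 3, 4];
matProd = M * M;     % matrix product: [7, 10; 15, 22]
elemProd = M .* M;   % element-wise product: [1, 4; 9, 16]
```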

Element-wise comparison:

>> a = [1 15 2 0.5]
a =

1.00000 15.00000 2.00000 0.50000

>> a < 3
ans =

1 0 1 1

>> find(a < 3) % indices of the elements of a that are less than 3
ans =

1 3 4

>> A
A =

1 2
3 4
5 6

>> [r, c] = find(A < 3)
r =

1
1

c =

1
2
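
The 0/1 result of a comparison can also be used directly as an index, which often replaces an explicit call to find. A sketch with illustrative names small and b:

```octave
a = [1, 15, 2, 0.5];
small = a(a < 3);    % select the elements less than 3: [1, 2, 0.5]
b = a;
b(b >= 3) = 0;       % overwrite the remaining elements: [1, 0, 2, 0.5]
```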

Many built-in functions operate element-wise:

>> v = [1, 2, 3]
v =

1 2 3

>> log(v)
ans =

0.00000 0.69315 1.09861

>> exp(v)
ans =

2.7183 7.3891 20.0855

>> abs([-1, 2, -3, 4])
ans =

1 2 3 4

>> -v % same as -1 * v
ans =

-1 -2 -3

Floor and Ceil of elements:

>> a
a =

1.00000 15.00000 2.00000 0.50000

>> floor(a)
ans =

1 15 2 0

>> ceil(a)
ans =

1 15 2 1

Matrix operations

Matrix multiplication:

>> A = [1, 2; 3, 4; 5, 6];
>> C = [1 1; 2 2];
>> A * C % matrix multiplication
ans =

5 5
11 11
17 17

Transpose:

>> A = [1, 2; 3, 4; 5, 6];
>> A' % transposed
ans =

1 3 5
2 4 6

Get the maximum element of a vector or matrix:

>> a = [1 15 2 0.5];
>> A = [1, 2; 3, 4; 5, 6];
>> max_val = max(a)
max_val = 15
>> [val, index] = max(a)
val = 15
index = 2
>> max(A) % `max(<Matrix>)` does a column-wise maximum
ans =

5 6
>> max(A, [], 1) % max per column
ans =

5 6

>> max(A, [], 2) % max per row
ans =

2
4
6

>> max(max(A)) % the max element of whole matrix
ans = 6
>> max(A(:))
ans = 6
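
To find where the largest element of a matrix sits, the linear index returned by max(A(:)) can be converted back to a row and column with ind2sub. A minimal sketch (val, idx, r and c are illustrative names):

```octave
A = [1, 2; 3, 4; 5, 6];
[val, idx] = max(A(:));           % val = 6, idx = 6 (column-major linear index)
[r, c] = ind2sub(size(A), idx);   % r = 3, c = 2
```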

Sum and product of a vector or matrix:

>> a
a =

1.00000 15.00000 2.00000 0.50000

>> A
A =

1 2
3 4
5 6

>> sum(a)
ans = 18.500
>> sum(A)
ans =

9 12

>> sum(A, 1)
ans =

9 12

>> sum(A, 2)
ans =

3
7
11

>> prod(a)
ans = 15
>> prod(A)
ans =

15 48

Get the diagonal elements:

>> A = magic(4)
A =

16 2 3 13
5 11 10 8
9 7 6 12
4 14 15 1

>> A .* eye(4)
ans =

16 0 0 0
0 11 0 0
0 0 6 0
0 0 0 1

>> sum(A .* eye(4))
ans =

16 11 6 1

>> flipud(eye(4)) % flip up down
ans =

Permutation Matrix

0 0 0 1
0 0 1 0
0 1 0 0
1 0 0 0

>> sum(A .* flipud(eye(4)))
ans =

4 7 10 13
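
The built-in functions diag, trace and fliplr reach the same diagonals more directly than multiplying by eye. A sketch, with illustrative variable names:

```octave
A = magic(4);
d = diag(A)';                     % main diagonal as a row: 16 11 6 1
mainSum = trace(A);               % sum of the main diagonal: 34
antiSum = sum(diag(fliplr(A)));   % anti-diagonal: 13 + 10 + 7 + 4 = 34
```

Both sums equal 34, the magic constant of a 4x4 magic square.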

Inverse (pinv computes the pseudo-inverse, which equals the inverse when A is invertible):

>> A = magic(3)
A =

8 1 6
3 5 7
4 9 2

>> pinv(A)
ans =

0.147222 -0.144444 0.063889
-0.061111 0.022222 0.105556
-0.019444 0.188889 -0.102778

>> pinv(A) * A % the identity matrix, up to floating-point rounding error
ans =

1.0000e+00 2.0817e-16 -3.1641e-15
-6.1062e-15 1.0000e+00 6.2450e-15
3.0531e-15 4.1633e-17 1.0000e+00
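
Because of rounding, the product should be compared with the identity using a tolerance, not exact equality. The sketch below also solves a linear system with the backslash operator, which avoids forming an inverse at all; the right-hand side b and the 1e-10 tolerance used in the comments are arbitrary choices for illustration.

```octave
A = magic(3);
err = norm(pinv(A) * A - eye(3));   % distance from the exact identity, tiny but nonzero
b = [1; 2; 3];                      % arbitrary right-hand side
x = A \ b;                          % solve A * x = b without computing an inverse
residual = norm(A * x - b);         % should be close to zero, e.g. below 1e-10
```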

Plotting Data

Plotting a function

>> clear
>> t = [0:0.01:0.98];
>> size(t)
ans =

1 99

>> y1 = sin(2*pi*4*t);
>> plot(t, y1);

It will show you a figure like this:

[Figure: plot of y1 = sin(2*pi*4*t) against t]

>> y2 = cos(2*pi*4*t);
>> plot(t, y2);

👆 This will replace the sin figure with a new cos figure.

If we want to have both the sin and cos plots, the hold on command will help:

>> plot(t, y1);
>> hold on;
>> plot(t, y2, 'r');

We can add axis labels, a legend, and a title to the figure:

>> xlabel("time");
>> ylabel("value");
>> legend('sin', 'cos'); % Show what the 2 lines are
>> title('my plot');

Now, we get this:

[Figure: the sin and cos curves with axis labels, legend, and title]

Then, we save it and close the plotting window:

>> print -dpng 'myPlot.png'    % save it as a PNG in the current directory
>> close

We can show two figures at the same time:

>> figure(1); plot(t, y1);
>> figure(2); plot(t, y2);

We can also place several plots in one figure:

[Figure: two subplots side by side, sin on the left and cos on the right]

To do this, use subplot:

>> subplot(1, 2, 1);    % divide the figure into a 1x2 grid and select the first cell
>> plot(t, y1);
>> subplot(1, 2, 2);
>> plot(t, y2);
>> axis([0.5, 1, -1, 1]) % set the x range to [0.5, 1] and the y range to [-1, 1] for the current subplot

Use clf to clear a figure:

>> clf;

Showing a matrix

>> A = magic(5)
A =

17 24 1 8 15
23 5 7 14 16
4 6 13 20 22
10 12 19 21 3
11 18 25 2 9

>> imagesc(A), colorbar

It gives us a figure like this:

[Figure: color grid of magic(5) with a colorbar]

The different colors correspond to the different values.

Another example:

>> B = magic(10);
>> imagesc(B), colorbar, colormap gray;

Output:

[Figure: grayscale grid of magic(10) with a colorbar]

Control Statements

for

>> v = zeros(10, 1)
v =

0
0
0
0
0
0
0
0
0
0

>> for i = 1: 10,
> v(i) = 2^i;
> end;
>> v'
ans =

2 4 8 16 32 64 128 256 512 1024

while

>> i = 1;
>> while i <= 5,
> v(i) = 100;
> i = i + 1;
> end;
>> v'
ans =

100 100 100 100 100 64 128 256 512 1024

if

>> for i = 1: 10,
> if v(i) > 100,
> disp(v(i));
> end;
> end;
128
256
512
1024

An if / elseif / else chain looks like this:

x = 1;
if (x == 1)
  disp("one");
elseif (x == 2)
  disp("two");
else
  disp("not one or two");
endif

break & continue

i = 1;
while true,
  v(i) = 999;
  i = i + 1;
  if i == 6,
    break;    % leave the loop after five assignments
  end;
end;

Output:

v =

999
999
999
999
999
64
128
256
512
1024
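
The example above shows only break. As a complementary sketch, continue skips the rest of the current iteration and moves on to the next one; the variable total is illustrative.

```octave
total = 0;
for i = 1:10
  if mod(i, 2) == 0
    continue;          % skip even numbers, go straight to the next i
  end
  total = total + i;   % reached only for odd i: 1 + 3 + 5 + 7 + 9
end
disp(total);
```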

Functions

Create a Function

To create a function, type the function code in a text editor (e.g. gedit or notepad), and save the file as functionName.m (the file name must match the function name).

Example function:

function y = squareThisNumber(x)

y = x^2;

To call this function in Octave, do either:

  1. cd to the directory of the functionName.m file and call the function:
% Navigate to directory:
cd /path/to/function

% Call the function:
functionName(args)
  2. Add the directory of the function file to the load path:
% To add the path for the current session of Octave:
addpath('/path/to/function/')

% To remember the path for future sessions of Octave, after executing addpath above, also do:
savepath

Function with multiple return values

Octave’s functions can return more than one value:

function [square, cube] = squareAndCubeThisNumber(x)

square = x^2;
cube = x^3;

>> [s, c] = squareAndCubeThisNumber(5)
s = 25
c = 125
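
The functions above use x^2, which is a matrix power and therefore raises an error for a vector or non-square matrix argument. A sketch of an element-wise variant, written as an anonymous function so it can be pasted at the prompt without creating a file (the name squareEach is made up for illustration):

```octave
squareEach = @(x) x .^ 2;      % hypothetical helper; .^ squares each element
s = squareEach(5);             % 25, same as squareThisNumber(5)
sv = squareEach([1, 2, 3]);    % [1, 4, 9], where x^2 would fail
```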

Practice

Suppose we have a data set with data points at (1, 1), (2, 2), (3, 3), and we want to define an Octave function that computes the cost function J(theta) for different values of theta.

[Figure: the three data points (1, 1), (2, 2), (3, 3)]

First, put the data into Octave:

X = [1, 1; 1, 2; 1, 3]    % Design matrix
y = [1; 2; 3]
theta = [0; 1]

Output:

X =

1 1
1 2
1 3

y =

1
2
3

theta =

0
1

Then define the cost function:

% costFunctionJ.m

function J = costFunctionJ(X, y, theta)

% X is the *design matrix* containing our training examples.
% y holds the target values (labels) of the training examples.

m = size(X, 1); % number of training examples
predictions = X * theta; % predictions of hypothesis on all m examples
sqrErrors = (predictions - y) .^ 2; % squared errors

J = 1 / (2*m) * sum(sqrErrors);

Now, use the costFunctionJ:

>> j = costFunctionJ(X, y, theta)
j = 0

We get j = 0 because theta = [0; 1] fits our data set perfectly.
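
To see the cost change with theta, the same formula can be evaluated at other parameter values. This sketch restates the formula from costFunctionJ as an anonymous function (named cost here purely for illustration) so that it runs without the separate .m file.

```octave
X = [1, 1; 1, 2; 1, 3];   % design matrix
y = [1; 2; 3];
m = size(X, 1);           % number of training examples
cost = @(theta) sum((X * theta - y) .^ 2) / (2 * m);   % same formula as costFunctionJ
J_fit = cost([0; 1]);     % perfect fit: 0
J_zero = cost([0; 0]);    % predictions are all 0: (1 + 4 + 9) / 6 = 2.3333
```

With theta = [0; 0] every prediction is 0, so the squared errors are 1, 4 and 9, and the cost is 14 / (2 * 3).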

Vectorization

Vectorization is the process of taking code that relies on loops and converting it into matrix operations. It is more efficient, more elegant, and more concise.

As an example, let’s compute our prediction from a hypothesis. theta is the vector of parameters of the hypothesis and x is the feature vector; both have n+1 elements.

With loops:

prediction = 0.0;
for j = 1:length(theta),   % length(theta) is n+1
  prediction += theta(j) * x(j);
end;

With vectorization:

prediction = theta' * x;

If you recall the definition of the inner product of two vectors, you’ll see that this one operation performs both the element-wise multiplication and the overall sum in a very concise notation.
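
A quick way to check that the two versions agree is to run both on the same small input. The values of theta and x below are made up for illustration.

```octave
theta = [1; 2; 3];   % illustrative parameters
x = [1; 4; 5];       % illustrative feature vector
loopPred = 0;
for j = 1:length(theta)
  loopPred = loopPred + theta(j) * x(j);   % accumulate one product per step
end
vecPred = theta' * x;   % inner product: 1*1 + 2*4 + 3*5 = 24
```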