Plod

The Author’s apology for his draft

This post is currently in draft form. This means that you should take nothing in the following paragraphs as true, verified, original, intentional, or edifying. This blog post is provided ‘as is’, without warranty of any kind, expressed or implied. Please put no more faith in it than you would in an essay produced by a typewriter-wielding monkey. I will eventually finish this post properly, and once that happens, I will stand behind my words fully.

I am publishing this post now because having something publicly accessible helps to keep me honest, and provides some much-needed encouragement to actually see the work to completion. Citations, verifications, and elucidations to follow.

Never solve a deterministic problem probabilistically

XOR is one of the standard bitwise operators, and is probably one of the first logic gates you will learn in any CS class. It is so important that it’s been part of Intel’s instruction sets since the 8008. In just about any programming language, you can express it with just one operator:

# binary 0
a = 0b0

# binary 1
b = 0b1

# Prints 1
print(a ^ b)

# Prints 0
print(a ^ a)
print(b ^ b)

XOR is easy to understand, and completely deterministic (0 ^ 0 will always be 0, and 0 ^ 1 will always be 1). But imagine someone put the XOR gate in a black box, so that you could only see the inputs and outputs, and they asked you to model its effects. You might implement something like this:

import numpy as np

# Hidden layer: two units computing x1 + x2 and x1 + x2 - 1 (before ReLU)
w1 = np.array([[1, 1], [1, 1]])
b1 = np.array([0, -1])

# Output layer: h1 - 2 * h2
w2 = np.array([[1], [-2]])
b2 = np.array([0])


def relu(z):
    return np.maximum(0, z)


def xor(x):
    x = np.array(x)
    y = relu(x @ w1 + b1) @ w2 + b2
    return y[0]


# Prints 0
print(xor([1, 1]))

# Prints 1
print(xor([0, 1]))

That is a multi-layer perceptron (MLP), a very simple neural network. Because XOR is deterministic, we can expect that over enough training passes, a trained MLP will eventually converge on something like the hand-crafted model above. That model reproduces XOR exactly, but it is obviously inferior: we have replaced a simple logic gate with over a dozen floating-point operations, and a trained version would trade 0 ^ 0 = 0 for 0 ^ 0 ≈ 0. Here the black-box nature of the modeled function made that tradeoff necessary, but in any other situation it would be a ridiculous exercise in overengineering.
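To make that ≈ concrete, here is a minimal sketch of what actually training such a network might look like: a 2-2-1 network with tanh hidden units, fitted to the four XOR input pairs by plain full-batch gradient descent on a squared-error loss. The activation, learning rate, seed, and iteration count are all arbitrary illustrative choices, not anything canonical.

import numpy as np

rng = np.random.default_rng(0)

# All four input pairs and their XOR targets
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([[0], [1], [1], [0]], dtype=float)

# A 2-2-1 network with tanh hidden units, randomly initialised
w1 = rng.normal(size=(2, 2))
b1 = np.zeros(2)
w2 = rng.normal(size=(2, 1))
b2 = np.zeros(1)

lr = 0.5  # learning rate (an illustrative choice)
for _ in range(5000):
    # Forward pass
    h = np.tanh(X @ w1 + b1)
    y = h @ w2 + b2

    # Backward pass: gradients of the mean squared error
    dy = 2 * (y - t) / len(X)
    dw2 = h.T @ dy
    db2 = dy.sum(axis=0)
    dh = (dy @ w2.T) * (1 - h ** 2)  # tanh'(z) = 1 - tanh(z)^2
    dw1 = X.T @ dh
    db1 = dh.sum(axis=0)

    # Gradient descent step
    w1 -= lr * dw1
    b1 -= lr * db1
    w2 -= lr * dw2
    b2 -= lr * db2

# Should print values near (but not exactly) 0, 1, 1, 0
print(y.ravel())

Even after thousands of passes, the outputs only hover near 0 and 1, and with an unlucky seed gradient descent can settle into a local minimum and never get close at all. That is the tradeoff in miniature.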

What is this project?

The Interface-Driven Lexical Learner (IDLL) is an attempt to


For now that’s all folks!

#Linguistics #Drafts