The lexical structure defines the basic building blocks of the Kisumu programming language: its characters, its tokens, and the rules for forming identifiers, keywords, and other syntactic components.
Kisumu uses the Unicode character set, supporting a wide range of characters from various languages and scripts. The standard encoding is UTF-8 to ensure compatibility and flexibility.
Tokens are the smallest units of the language's syntax. Kisumu supports the following token types:
Identifiers are names used to represent variables, functions, and other user-defined elements. They must begin with a letter or an underscore (_) and can be followed by letters, digits, or underscores. Identifiers are case-sensitive.
Examples:
validName
_variable1
myFunction
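The identifier rule above can be checked mechanically. The following is a minimal sketch in Python, not part of Kisumu itself. The names IDENTIFIER_RE and is_identifier are hypothetical, and the sketch assumes that "letter" means an ASCII letter, which the text does not confirm.

```python
import re

# Assumption: "letter" is restricted to ASCII A-Z and a-z.
# A Unicode-aware rule would need a broader character class.
IDENTIFIER_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_identifier(text: str) -> bool:
    """Return True if text has the shape of an identifier."""
    return IDENTIFIER_RE.fullmatch(text) is not None
```

Under these assumptions, is_identifier accepts validName, _variable1, and myFunction, and rejects strings such as 1stValue or my-name. It checks shape only and does not exclude reserved keywords.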
Keywords are reserved words with specific meanings in the language. They cannot be used as identifiers. The initial set of keywords includes:
int, float, string, bool, func, return, if, else, for, while, break, continue, true, false
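Because keywords have the same shape as identifiers, a lexer typically reads a word first and then looks it up in the keyword set. The sketch below illustrates that lookup in Python. The names KEYWORDS and classify_word are hypothetical, and exact, case-sensitive matching of keywords is an assumption; the text states case sensitivity only for identifiers.

```python
# Keyword list copied from the text above.
KEYWORDS = frozenset({
    "int", "float", "string", "bool", "func", "return", "if", "else",
    "for", "while", "break", "continue", "true", "false",
})


def classify_word(word: str) -> str:
    """Label an identifier-shaped word as 'keyword' or 'identifier'."""
    # Assumption: matching is exact, so "While" is not the keyword "while".
    return "keyword" if word in KEYWORDS else "identifier"
```

With this lookup, while is reported as a keyword, whereas While and myFunction are reported as identifiers.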
Literals represent fixed values in the code, including:
Integer literals (e.g., 42, 0, -7)
Floating-point literals (e.g., 3.14, -0.001)
String literals (e.g., "Hello, Kisumu!")
Boolean literals (true, false)
Operators are symbols used to perform operations on data. Examples include:
Arithmetic: +, -, *, /, %
Comparison: ==, !=, <, >, <=, >=
Logical: &&, ||, !
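Several operators share a first character (for example < and <=, or ! and !=), so a scanner has to decide which one to emit. A common convention is to prefer the longest match. The text does not state that Kisumu does this, so the Python sketch below is an assumption, and OPERATORS and match_operator are hypothetical names.

```python
from typing import Optional

# Operators from the text. Two-character operators come first so that
# "<=" is tried before "<" (longest match, assumed rather than specified).
OPERATORS = [
    "==", "!=", "<=", ">=", "&&", "||",
    "+", "-", "*", "/", "%", "<", ">", "!",
]


def match_operator(source: str, pos: int) -> Optional[str]:
    """Return the operator that starts at source[pos], or None."""
    for op in OPERATORS:
        if source.startswith(op, pos):
            return op
    return None
```

For the input "a <= b", calling match_operator at offset 2 yields <= rather than <.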
Punctuation tokens are used for syntax structuring. Examples include:
Comma: ,
Parentheses: ( and )
Braces: { and }
Brackets: [ and ]
Comments are ignored by the compiler and are used to annotate code. Kisumu supports:
Single-line comments, which start with // and extend to the end of the line.
Multi-line comments, which are enclosed between /* and */.
Examples:
// This is a single-line comment
/* This is a
multi-line comment */
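One way to picture how comments are ignored is a pass that removes them before tokens are read. The following Python sketch is illustrative only, and strip_comments is a hypothetical name. It assumes that multi-line comments do not nest and that string literals are double-quoted, sit on one line, and have no escape sequences. None of this is specified in the text.

```python
import re

# Assumptions: block comments do not nest; string literals are
# double-quoted, single-line, and contain no escape sequences.
_STRING_OR_COMMENT = re.compile(
    r'"[^"\n]*"'      # string literal, matched so it can be kept as-is
    r"|//[^\n]*"      # single-line comment, up to but not including newline
    r"|/\*.*?\*/",    # multi-line comment, shortest possible match
    re.DOTALL,
)


def strip_comments(source: str) -> str:
    """Replace each comment with one space; leave string literals intact."""
    def replace(match):
        text = match.group(0)
        return text if text.startswith('"') else " "
    return _STRING_OR_COMMENT.sub(replace, source)
```

Replacing a comment with a space, rather than deleting it, keeps the tokens on either side of the comment separated.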
Whitespace characters (spaces, tabs, and newlines) are generally ignored by the compiler but are used to separate tokens.
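The token categories described in this section can be combined into a small tokenizer. The Python sketch below is a rough illustration under several assumptions, not a reference lexer. The names TOKEN_SPEC and tokenize are hypothetical, identifiers are limited to ASCII, strings have no escape sequences, and a leading minus sign is emitted as a separate operator instead of being folded into the numeric literal.

```python
import re

KEYWORDS = {
    "int", "float", "string", "bool", "func", "return", "if", "else",
    "for", "while", "break", "continue", "true", "false",
}

# Order matters: comments before operators (so "//" is not two "/"),
# and floats before integers (so "3.14" is not split at the dot).
TOKEN_SPEC = [
    ("COMMENT",    r"//[^\n]*|/\*.*?\*/"),
    ("WHITESPACE", r"\s+"),
    ("FLOAT",      r"\d+\.\d+"),
    ("INT",        r"\d+"),
    ("STRING",     r'"[^"\n]*"'),            # assumption: no escapes
    ("WORD",       r"[A-Za-z_][A-Za-z0-9_]*"),  # assumption: ASCII only
    ("OPERATOR",   r"==|!=|<=|>=|&&|\|\||[+\-*/%<>!]"),
    ("PUNCT",      r"[,(){}\[\]]"),
]
MASTER_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC),
    re.DOTALL,
)


def tokenize(source: str) -> list:
    """Split source into (kind, text) pairs, skipping whitespace and comments."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER_RE.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character {source[pos]!r} at offset {pos}")
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        if kind in ("COMMENT", "WHITESPACE"):
            continue  # ignored, but they still end the previous token
        if kind == "WORD":
            if text in ("true", "false"):
                kind = "BOOL"
            elif text in KEYWORDS:
                kind = "KEYWORD"
            else:
                kind = "IDENTIFIER"
        tokens.append((kind, text))
    return tokens
```

The effect of whitespace as a separator shows up directly: tokenize("intx") produces a single identifier, while tokenize("int x") produces the keyword int followed by the identifier x.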