深入解析浏览器解析机制

一个HTML解析器作为一个状态机,它从输入流中获取字符并按照转换规则转换到另一种状态。在解析过程中,任何时候它只要遇到一个'<‘符号(后面没有跟’/’符号)就会进入“标签开始状态(Tag open state)”。然后转变到“标签名状态(Tag name state)”,“前属性名状态(before attribute name state)”……最后进入“数据状态(Data state)”并释放当前标签的token。当解析器处于“数据状态(Data state)”时,它会继续解析,每当发现一个完整的标签,就会释放出一个token。

动态规划算法解决查找最长回文问题

class Solution {
public:
string longestPalindrome(string s) {
//BOTTOM-TOP 动态规划实现查找最长回文串
//初始化备忘矩阵
const int sz = s.size();
bool**q = new bool*[sz];
for (int i = 0; i < sz; ++i){
q[i] = new bool[sz];
for (int j = 0; j < sz; ++j){
q[i][j] = false;
}
}
//开始自底部向顶部动态规划
//首先初始化回文串起始终止位置
int pstart = -1, pend = pstart;
for (int i = sz – 1; i >= 0; –i){
for (int j = i; j < sz; ++j){
//如果子串的首尾字符相同,并且其距离不大于2或者首尾之间的子子串在备忘中是true的,
//该字串肯定是一个回文
//cout << i << “->” << j<<“|”;
if (s[i] == s[j] && (j – i <= 2 || q[i+1][j-1])==true){
q[i][j] = true;
if ((j – i+1)>(pend – pstart)) {
pstart = i;
pend = j;
//cout << i << “->” << j << endl;
}
}
}//end for

}//enf for
delete[] q;
return s.substr(pstart, pend – pstart + 1);
}

};

leetcode9 [Palindrome Number] [Easy,C++]

题目来源:

https://leetcode.com/problems/palindrome-number/

题目描述:

Determine whether an integer is a palindrome. Do this without extra space.

Some hints:Could negative integers be palindromes? (ie, -1)

If you are thinking of converting the integer to string, note the restriction of using extra space.

You could also try reversing an integer. However, if you have solved the problem “Reverse Integer”, you know that the reversed integer might overflow. How would you handle such case?

There is a more generic way of solving this problem.

中文描述:

判断一个int是不是对称数。

限制计算空间。

解题思路:

根据对称数的判定特点,首先将int按中点截为2个子段。

然后对字段进行判断。

对字段的判断有多种方法,可以将其中一个逆转然后比较两个数的大小。这种方法要比下面的方法效率更高。

也可以按位判断,从一个段的正序与另一个段的逆序按位判断。

我的AC:

<1>正逆序按位比较法     100ms

bool isPalindrome(int x) {
if (x < 0) return false;
if (x == 0) return true;
unsigned pdlen = 0, half=0;
int jlen = x,header=0,tailer=0,prefix=0;
while (jlen){//首先求出int长度
++pdlen;
jlen /= 10;
}
half = pdlen / 2;//分割数据
prefix = pow(10, half);
header = pdlen & 0x1 ? x / pow(10, half + 1) : x / pow(10, half);
tailer = x – x / prefix*prefix;

while (half){//按位判断
int div = pow(10, half – 1);
int htop = header / div;
header -= htop*div;

int tbot = tailer % 10;
tailer /= 10;
if (htop != tbot) return false;
–half;
}
return true;
}

<2>单一逆序比较法   76ms

bool isPalindrome(int x) {
if (x < 0) return false;
if (x == 0) return true;
unsigned pdlen = 0, half=0;
int jlen = x,header=0,tailer=0,prefix=0;
while (jlen){
++pdlen;
jlen /= 10;
}
half = pdlen / 2;
prefix = pow(10, half);
header = pdlen & 0x1 ? x / pow(10, half + 1) : x / pow(10, half);
tailer = x – x / prefix*prefix;

//上面先求出数据长度

//下面将header逆转与tailer比较大小

//不逆转tailer的原因在于像00000012这样的tailer逆序的话不会算成21000000,而是21.而header逆转不会有这个问题。

int reverse_header = 0;
while (header){
int mod = header % 10;
header /= 10;
reverse_header = reverse_header * 10 + mod;
}
if (reverse_header != tailer) return false;
return true;
}

leetcode8 [String to Integer (atoi)] [Easy,C++]

题目来源:

https://leetcode.com/problems/string-to-integer-atoi/

题目描述:

Implement atoi to convert a string to an integer.

Hint: Carefully consider all possible input cases. If you want a challenge, please do not see below and ask yourself what are the possible input cases.

Notes: It is intended for this problem to be specified vaguely (ie, no given input specs). You are responsible to gather all the input requirements up front.

Update (2015-02-10):
The signature of the

C++

function had been updated. If you still see your function signature accepts a

const char *

argument, please click the reload button to reset your code definition.

spoilers alert… click to show requirements for atoi.

Requirements for atoi:The function first discards as many whitespace characters as necessary until the first non-whitespace character is found. Then, starting from this character, takes an optional initial plus or minus sign followed by as many numerical digits as possible, and interprets them as a numerical value.

The string can contain additional characters after those that form the integral number, which are ignored and have no effect on the behavior of this function.

If the first sequence of non-whitespace characters in str is not a valid integral number, or if no such sequence exists because either str is empty or it contains only whitespace characters, no conversion is performed.

If no valid conversion could be performed, a zero value is returned. If the correct value is out of the range of representable values, INT_MAX (2147483647) or INT_MIN (-2147483648) is returned.

中文描述:

实现atoi方法以将string转为一个int。

atoi的实现要求:

略过字符串开始的空格,直到遇到第一个非空格字符。

对从该非空格字符开始直到下一个空格或字符串结尾之间的子字符串进行处理。

在子字符串中从头开始捕获尽可能长的连续数字字符串。

如果没有该模式的字符串,返回0。

如果数字范围超过了int所能表示的范围,超过上界返回INT_MAX,低于下界返回INT_MIN.

解题思路:

本题考查对特殊情况的处理。

<1>int上下界的处理方法,比如要用long long类型的变量接收总和。并在总和计算循环中对上下界进行比较,防止最后的总和越过long long范围的情况发生。

<2>对于首个非空格字符为+ -两个字符的特殊考虑。

<3>对于字符开始的连续空格与计数开始之后出现的空格的处理

我的AC:

int myAtoi(string str) {
long long sum = 0;//用long long类型的变量接收总和,以防止可能的溢出。
const int max = 0x7fffffff, min = 0x80000000;//设置int的上下界
int flag = 1,sflag=0;
string::iterator is = str.begin();
while (is != str.end()){
char curChar = *is;
if (isdigit(curChar)){
if (curChar==’0’&&sum==0) ++is;//如果是字符是0,并且sum=0,那么这个0前面没有别的数字,跳过该0往后计算
else {
sum = sum * 10 + curChar – ‘0’;//否则计算sum
if (sum*flag > max) return max;//及时判断是否越界
if (sum*flag < min) return min;
++is;
}
sflag = 1;
}
else if (isspace(curChar)){//对空格的处理
if (sflag) break;//如果空格出现在计数数字之后,停止循环
++is;//如果空格出现在字符串最前面,继续循环
}
else if (curChar == ‘-‘ || curChar == ‘+’){//对+-号的处理
string::iterator is_tail = is + 1;
if (is_tail != str.end()){
if (isdigit(*is_tail)) {
if (curChar==’-‘) flag = -1;
}
else break;
}
++is;
}
else break;
}

return sum*flag;
}

leetcode7 [Reverse Integer] [Easy,C++]

题目来源:

https://leetcode.com/problems/reverse-integer/

题目描述:

Reverse digits of an integer.

Example1: x = 123, return 321
Example2: x = -123, return -321

Have you thought about this?Here are some good questions to ask before coding. Bonus points for you if you have already thought through this!If the integer’s last digit is 0, what should the output be? ie, cases such as 10, 100.Did you notice that the reversed integer might overflow? Assume the input is a 32-bit integer, then the reverse of 1000000003 overflows. How should you handle such cases?For the purpose of this problem, assume that your function returns 0 when the reversed integer overflows.Update (2014-11-10):
Test cases had been added to test the overflow behavior.中文解释:将整形数字逆转输出,符号位不变。如果出现数据溢出,返回0;

自己的解题思路:

这道题很简单,考察的地方在于:

<1>int的范围边界

<2>c++中求余运算时的结果符号问题

==================================================================

#include “stdafx.h”
#include <string>
#include <vector>
#include <iostream>
using namespace::std;
#pragma warning(disable:4146)
class Solution {
public:
int reverse(int x) {
long long v = 0;
int max = 0x7fffffff, min = 0x80000000;//注意int最小负数的表示 1000 0000 0000 0000 0000 0000 0000 0000B===>80000000H
while (x){
int tmp = x % 10;//求余,其符号跟随x
v = v * 10 + tmp;//当前结果*10,然后累加余数
cout << v << endl;
if (v > max || v < min) return 0;//如果溢出,返回0
x = x / 10;
}

return v;
}
};

==================================================================

 

 

leetcode-6 [ZigZag Conversion] [Easy,C++]

题目描述:

The string

"PAYPALISHIRING"

is written in a zigzag pattern on a given number of rows like this: (you may want to display this pattern in a fixed font for better legibility)

P   A   H   N
A P L S I I G
Y   I   R

And then read line by line:

"PAHNAPLSIIGYIR"

Write the code that will take a string and make this conversion given a number of rows:

string convert(string text, int nRows);
convert("PAYPALISHIRING", 3)

should return

"PAHNAPLSIIGYIR"

.

中文:”zigzag”模式将如“PAYPALISHIRING”的一串数字如下组织:

P   A   H   N
A P L S I I G
Y   I   R
相当于波浪线,上下起伏。
=============================================================================
1           7             d
2       6   8        c    e
3   5       9    b        f
4           a             g

==============================================================================
自己的解法:

&lt;1&gt;上图每两条竖线之间的区域相当于一个n*n的n维矩阵。
&lt;2&gt;可以观察到如果矩阵同一行的元素是在竖线上的,那么他们之间的差值对于每一行来说都是等量的。那么这个差值是多少呢?
   观察第一行的1 7,第二行的2 8。可以发现差值点之间会经过一条对角线,两条竖线的部分。而两条竖线的部分恰好是矩阵的一条边长。
   那么去除重复点可以归纳出这个差值就是2*n-2=2*(n-1)。
&lt;3&gt;同时还会发现,同一行的两条竖线间最多会出现一个元素。该元素在数串中的位置也是有规律的。观察3与5之间经过的路径,2与6经过的路径会发现,他们之间的差值与左侧竖线元素的行数有很大的联系。2出现在第二行,当由2至6时,先向下经过了2个位置,又向右经过了2个位置到达6.再看3到5,3在第三行,3到5是向下一个向右1个位置。此时会得出结论,这个差值就是总行数与当前行的差值的1倍,即(numRows-currentRow)*2.
基于&lt;2&gt;&lt;3&gt;的归纳,编码如下:
&lt;code&gt;
string convert(string s,int numRows){
 int sz = s.size();
 string ret;
 int row = 1, pivot = 0;
 while (row &lt;= numRows){
 vector&lt;char&gt; v;
 int next_pivot;
 int b_pivot = pivot;
 while (b_pivot &lt; sz){
 v.push_back(s[b_pivot]);//存储当前元素
//此处有个例外,如果只有一行,那么下一个竖线的节点应该是当前节点直接+1
//否则要按照差值计算
 if (numRows==1) next_pivot = 1 + b_pivot;//计算下一个竖线中在该行的节点
 else next_pivot = 2 * (numRows - 1) + b_pivot;

 int inside = next_pivot - 2 * (row-1);//计算两条竖线间属于该行的节点
 if (inside != next_pivot&amp;&amp;inside != b_pivot&amp;&amp;inside&lt;sz){//如果竖线间存在节点,加入vector
 v.push_back(s[inside]);
 }
 b_pivot = next_pivot;
 }
 //输出该行节点
 for (vector&lt;char&gt;::iterator vi = v.begin(); vi != v.end(); ++vi){
 ret.push_back(*vi);
 }
 cout &lt;&lt; endl;
 ++pivot;
 ++row;
 }
 return ret;
 }
&lt;/code&gt;